-
Notifications
You must be signed in to change notification settings - Fork 487
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Too many saithrift servers in SAI repo #1879
Comments
@chrispsommers fyi, can you have some comments maybe on this ? @rlhui @lguohan |
|
@kcudnik re:
|
@kcudnik re:
I can't recall if this is a workaround to avoid breaking DASH-specific PTF or SAI Challenger test cases, it might be. Seems like it should be filed in DASH as an issue. We might want to get @reshmaintel or @andriy-kokhan to comment. |
@chrispsommers , I think saithrift is still referred in some places in sonic-buildimage. The community may want to clean it up first even if it's not actually involved in the build process. |
I do not recall either SAI-C TCs that retrieve switch/port attributes or SAI-C workarounds for that. Usually, |
i noticed that failure when returning actual sai not implemented in DASH sai implementation, let me bring that locally and i will post here which tests are failing, i don't remember whether that was particular testcase or sithrift server itself |
@andriy-kokhan ok, when i run ptf tests, and return actually SAI_STATUS_NOT_SUPPORTED(-2) on not supported attributes i get this error $ make run-saithrift-ptftests
./disable_veth_ipv6.sh
Disable IPv6 to suppress Linux control plane noise
...
docker run -u root --network=host --rm -w /tests/ -it \
-w /tests/ \
local/dash-saithrift-client:latest \
./functional/ptf/run-tests.sh
+ ptf --test-dir ./functional/ptf --pypath /SAI/ptf --interface 0@veth1 --interface 1@veth3 --test-case-timeout=1800 '--test-params=connection='\''tcp'\'';target='\''bmv2'\'';traffic_check='\''bidir'\'';'
/usr/local/lib/python3.8/dist-packages/ptf-0.9.3-py3.8.egg/EGG-INFO/scripts/ptf:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
Using packet manipulation module: ptf.packet_scapy
saidashacl.SaiThriftDashAclTest ... Reboot mode is: cold
Config_db.json uses default path /SAI/ptf/config/../resources/config_db.json
port_config.ini uses default path /SAI/ptf/config/../resources/port_config.ini
Ignore all the expect error code and exception captures.
Get number_of_system_ports 0
Get number_of_active_ports 2
Restore all the expect error code and exception captures.
SkipTest on expected error. sai_thrift_get_switch_attribute with errorcode: -2 error: sai_thrift_exception(status=-2)
skipped 'SkipTest on expected error. sai_thrift_get_switch_attribute with errorcode: -2 error: sai_thrift_exception(status=-2)'
----------------------------------------------------------------------
Ran 1 test in 0.015s
OK (skipped=1)
saidasheni.CreateDeleteEniTest ... Reboot mode is: cold
Config_db.json uses default path /SAI/ptf/config/../resources/config_db.json
port_config.ini uses default path /SAI/ptf/config/../resources/port_config.ini and run tests hangs in this position, seems indefinitely this is the error:
when i return dummy success, the flow is like this:
|
i don't know which function sai_thrift_get_switch_attribute is throwing exception, the one in saithrift cpp or adapter.py, but the test should be able to handle that |
this is saithrift current implementation: 14715 void sai_thrift_get_port_attribute(sai_thrift_attribute_list_t& attr_list_out, const sai_thrift_object_id_t port_oid, const sai_thrift_attribute_list_t& attr_list) {
...
14749 status = port_api->get_port_attribute((sai_object_id_t)port_oid, attr_count, sai_attr_list);
14750 if (status != SAI_STATUS_SUCCESS) {
14751 sai_thrift_exception e;
14752 e.status = status;
14753 throw e;
14754 } it throws exception on any non success api |
in sai_adapter.py: (generated adapter) 24 # Expected error code
25 # Used with SKIP_TEST_ON_EXPECTED_ERROR
26 # For some expected errors in test
27 # Like SAI_STATUS_NOT_SUPPORTED = -2
28 EXPECTED_ERROR_CODE = [-2]
...
48408 def sai_thrift_get_switch_attribute(client,
48409 number_of_active_ports=None,
...
49652 attr_list = client.sai_thrift_get_switch_attribute(attr_list)
49653 except sai_thrift_exception as e:
49654 status = e.status
49655 if SKIP_TEST_ON_EXPECTED_ERROR and status in EXPECTED_ERROR_CODE:
49656 reason = "SkipTest on expected error. sai_thrift_get_switch_attribute with errorcode: {} error: {}".format(
49657 status, e)
49658 print(reason)
49659 testutils.skipped_test_count=1
49660 raise SkipTest(reason)
49661 if CATCH_EXCEPTIONS:
49662 return None
49663 else:
49664 raise e
49665 even if error is in expected error code, you will still get raise exception SkipTest, the problem here is that python SAI apis don't return error code, they return attribute, and exception in case of error, so the actual tests would need to be modified to catch exception if not supported attribute is queried in case of ptf tests, this is attribute 11 which is SAI_PORT_ATTR_SUPPORTED_FEC_MODE_EXTENDED, and after that maybe some other attributes ?
|
problem lies in this function DASH/dash-pipeline/SAI/SAI/ptf/sai_base_test.py 829 def saveNumberOfAvaiableResources(self, debug=False):
830 """
831 Save number of available resources
832 This allows to verify if all the test objects were removed
833
834 Args:
835 debug (bool): enables debug option
836 Return:
837 dict: switch_resources dictionary with available resources
838 """
839
840
841 switch_resources = sai_thrift_get_switch_attribute(
842 self.client,
843 available_ipv4_route_entry=True,
844 available_ipv6_route_entry=True,
845 available_ipv4_nexthop_entry=True,
846 available_ipv6_nexthop_entry=True,
847 available_ipv4_neighbor_entry=True,
848 available_ipv6_neighbor_entry=True,
849 available_next_hop_group_entry=True,
850 available_next_hop_group_member_entry=True,
851 available_fdb_entry=True,
852 available_ipmc_entry=True,
853 available_snat_entry=True,
854 available_dnat_entry=True,
855 available_double_nat_entry=True,
856 number_of_ecmp_groups=True,
857 ecmp_members=True)
858
859
860 if debug:
861 self.printNumberOfAvaiableResources(switch_resources)
862
863 return switch_resources it throws exception, since so far this was returning success, but when we actually return not supported, this throws exception and actual test is not executed at all |
all this is a little bit convoluted, with that previous catch skiptest #1879 (comment), i think this should all be revisited, since single not supported on availability resource should not be deal breaker on the test, but i have too little knowledge to figure out how to properly fix this from my understanding that exception catch should be inside each test separately, instated of sai_adapter.py fyl @lguohan |
@lguohan @andriy-kokhan @chrispsommers i added info warning message that we will remove bm dir at the end of 2023 #1889 |
Adding @ravi861 to help with DASH SAI PTF exception. Thanks. |
@kcudnik , I recalled, we keep using the global variable, gSwitchId, cause saiserver might need to be used within and integrated with syncd container(saiserver v1 also supports that), like read and write some output with syncd manner, but also support rpc invocations. |
@kcudnik like syncd output the sairedis.log, sai test cases also generated the logs with method name, parameters, and return values, i cannot recall if the exception will be locked, we can check on that. |
ok, sounds good |
While playing with DASH an SAI repos i noticed that there are too many saithrift servers, and each one is slightly different then each other:
then DASH utilizes saithriftv2 from SAI repo option 3, should we make and maintain only one saithrift server and deprecate and remove others ? we should deprecate 1,2 and 4 i think, and also why gensairpc.pl is generating sai_rpc_server.cpp but this file is not used ? - it is used, i overlooked that, it's included as cpp file in sai_rpc_frontend.cpp (agrh!)
any ideas?
also i would like to get rid of global and external linking to gSwitchId and gPortMap, those are global variables in frontends needed by saithrit when is used as a library for example in syncd, normally those variables are declared in thrift server main, but in case of syncd they are implemented in syncd itself, can saithrift figure out gSwitchId when user is calling create_switch, and PortMap from get PORT_LIST attribute on switch, and then attributes on specific ports if needed?
btw. that SAI/meta/sai_rpc_frontent.cpp have bugs, for example when it passes attributes and it encounters unknow attribute type, it just ignoring it, without any warning or error, and user has no idea whether this attribute was passed correctly or not, this is in attribute conversion thrift/sai here: https://github.com/opencomputeproject/SAI/blob/master/meta/sai_rpc_frontend.cpp#L512 and https://github.com/opencomputeproject/SAI/blob/master/meta/sai_rpc_frontend.cpp#L822 (at least those 2 obvious ones)
also there is one of the problems in DASH repo libsai.so which returns SUCCESS on failed get port/switch attribute, and saithrift or ptf tests are not able to handle that and fails, which is wrong behavior, it should be able to handle error message SAI_STATUS_NOT_SUPPORTED, this depends to it's own issue, but at this point i don't know whether this should be in DASH saithrift, SAI test/saithriftv2 or SAI/test/ptf tests, or DASH/dash-pipeline tests
The text was updated successfully, but these errors were encountered: