-
Because of GitHub's file size limit, the dataset is hosted at "https://pan.baidu.com/s/1rOXuklyzkwHsK3TWCeEOow" (extraction code: "ej2v"). First download the "total_data.pt" file and put it in the same folder as lunwen.py. The "total_data.pt" file can be generated by merge_data.py, so if you have already downloaded it, you can skip running merge_data.py.
-
Run lunwen.py, which is the pretext task.
-
Run text.py, which is the downstream classification model.
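
The run order described above can be sketched as a small helper script. This is only an illustration, not part of the repository: the names `plan_steps` and `run_pipeline` are hypothetical, and the sketch assumes that the three scripts take no command-line arguments and that merge_data.py writes "total_data.pt" into the same folder (and has whatever input data it needs available).

```python
import os
import subprocess
import sys

DATA_FILE = "total_data.pt"


def plan_steps(folder, exists=os.path.isfile):
    """Return the scripts to run, in order, for the given folder.

    `exists` is a predicate on a file path; it defaults to os.path.isfile
    and can be swapped out to try the logic without touching the disk.
    """
    steps = []
    # merge_data.py is only needed when total_data.pt was not downloaded.
    if not exists(os.path.join(folder, DATA_FILE)):
        steps.append("merge_data.py")
    steps.append("lunwen.py")  # pretext task
    steps.append("text.py")    # downstream classification model
    return steps


def run_pipeline(folder="."):
    """Run each planned script with the current interpreter.

    Assumption: the scripts need no arguments. check=True stops the
    sequence as soon as one script exits with a non-zero status.
    """
    for script in plan_steps(folder):
        subprocess.run([sys.executable, script], cwd=folder, check=True)
```

Calling `run_pipeline(".")` from the folder that contains lunwen.py would then run merge_data.py only if "total_data.pt" is missing, followed by lunwen.py and text.py.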