Source code for t1modeler.com dataset preparation
The scripts in this repository facilitate the following tasks:
- download a raw data file from one of the source web pages
- convert the data into a pandas DataFrame and binarize the target variable
- save the DataFrame as a CSV file that, once compressed, is ready to upload to t1modeler.com for modeling
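The three tasks above can be sketched roughly as follows. This is a minimal illustration, not the code of any script in this repository: the function names (`download_text`, `prepare_dataset`, `save_for_modeling`), the column names, and the sample rows are all hypothetical, and the sketch assumes a headerless, comma-separated source file.

```python
import io
import urllib.request

import pandas as pd


def download_text(url):
    """Fetch a raw data file and return its contents as text.

    The URL is supplied by the caller; none is hard-coded here.
    """
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def prepare_dataset(raw_text, column_names, target_column, positive_label):
    """Parse headerless CSV text and binarize the target column to 0/1."""
    df = pd.read_csv(
        io.StringIO(raw_text),
        header=None,
        names=column_names,
        skipinitialspace=True,  # tolerate "value, value" spacing
    )
    # Rows equal to positive_label become 1, every other row becomes 0.
    df[target_column] = (df[target_column] == positive_label).astype(int)
    return df


def save_for_modeling(df, csv_path):
    """Write the DataFrame as CSV without the pandas index column."""
    df.to_csv(csv_path, index=False)


# Hypothetical two-row sample standing in for a downloaded file.
sample_text = "39, <=50K\n50, >50K\n"
sample_df = prepare_dataset(
    sample_text,
    column_names=["age", "income"],
    target_column="income",
    positive_label=">50K",
)
print(sample_df)
```

If a compressed file is wanted, `DataFrame.to_csv` infers the compression from the file extension by default, so passing a path ending in `.zip` or `.gz` to `save_for_modeling` writes a compressed CSV directly. Which archive format t1modeler.com expects is not stated here, so treat that choice as an assumption.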
The table below lists the source page for each script.
# | File Name | Source Page |
---|---|---|
1 | keel_001_kdd_cup_1999.py | Link |
2 | keel_002_sonar_mines_vs_rocks.py | Link |
3 | keel_003_molecular_biology.py | Link |
4 | keel_004_connect_4.py | Link |
5 | uci_001_adult_data_set.py | Link |
6 | uci_002_bank_marketing.py | Link |
7 | uci_003_human_activity_recognition.py | Link |
8 | uci_004_credit_approval.py | Link |
9 | uci_005_cylinder_bands.py | Link |
10 | uci_006_internet_advertisements.py | Link |
11 | uci_007_ionosphere.py | Link |
12 | uci_008_letter_recognition.py | Link |
13 | uci_009_multiple_features.py | Link |
14 | uci_010_mushroom.py | Link |
15 | uci_011_spambase.py | Link |
16 | uci_012_insurance_company_benchmark.py | Link |
17 | uci_013_german_credit_data.py | Link |
18 | uci_014_secom.py | Link |
19 | uci_015_qsar_biodegradation.py | Link |
20 | uci_016_seismic_bumps.py | Link |
21 | uci_017_thoracic_surgery_data.py | Link |
22 | uci_018_phishing_websites.py | Link |
23 | uci_019_default_of_credit_card_clients.py | Link |
24 | uci_020_sports_articles_objectivity.py | Link |
25 | uci_021_heart_disease.py | Link |
26 | uci_022_dermatology.py | Link |
27 | uci_023_madelon.py | Link |
28 | uci_024_ozone_level_detection.py | Link |
29 | uci_025_parkinsons.py | Link |
30 | uci_026_cardiotocography.py | Link |
31 | uci_027_miniboone_particle_identification.py | Link |
32 | uci_028_gas_sensor_array_drift.py | Link |
33 | uci_029_cnae_9.py | Link |
34 | uci_030_climate_model_simulation_crashes.py | Link |
35 | uci_031_eeg_eye_state.py | Link |
36 | uci_032_lsvt_voice_rehabilitation.py | Link |
37 | uci_033_urban_land_cover.py | Link |
38 | uci_034_diabetes_130_us_hospitals.py | Link |
39 | uci_035_gesture_phase_segmentation.py | Link |
40 | uci_036_student_performance.py | Link |
41 | uci_037_sensorless_drive_diagnosis.py | Link |
42 | uci_038_tv_news_channel_commercial_detection.py | Link |
43 | uci_039_diabetic_retinopathy_debrecen.py | Link |
44 | uci_040_online_news_popularity.py | Link |
45 | uci_041_mice_protein_expression.py | Link |
46 | uci_042_occupancy_detection.py | Link |
47 | uci_043_gas_sensors_for_home_activity.py | Link |
48 | uci_044_polish_companies_bankruptcy.py | Link |
49 | uci_045_htru2.py | Link |
50 | uci_046_cervical_cancer.py | Link |
51 | uci_047_epileptic_seizure_recognition.py | Link |
52 | uci_048_burst_header_packet.py | Link |
53 | uci_049_extention_of_z_alizadeh_sani.py | Link |
54 | uci_050_ida2016challenge.py | Link |
55 | uci_051_hcc_survival.py | Link |
56 | uci_052_online_shoppers_purchasing_intention.py | Link |
57 | uci_053_electrical_grid_stability.py | Link |
58 | uci_054_caesarian_section_classification.py | Link |
59 | uci_055_audit_data.py | Link |
60 | uci_056_hepatitis_c_virus.py | Link |
61 | uci_057_glass_identification.py | Link |
62 | uci_058_iris.py | Link |
63 | uci_059_optical_recognition_of_handwritten_digits.py | Link |
64 | vanderbilt_001_titanic.py | Link |
65 | vanderbilt_002_acute_bacterial_meningitis.py | Link |
66 | vanderbilt_003_ari_dataset.py | Link |
67 | vanderbilt_004_duchenne_muscular_dystrophy.py | Link |
68 | vanderbilt_005_right_heart_catheterization.py | Link |
69 | vanderbilt_006_ucla_stress_echocardiography.py | Link |
70 | vanderbilt_007_support_study.py | Link |
71 | vanderbilt_008_very_low_birth_weight_infants.py | Link |
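The file names in the table appear to follow a `<source>_<number>_<dataset>.py` pattern, with `keel`, `uci`, and `vanderbilt` as the source prefixes. Assuming that pattern holds, a small helper like the one below could split a name into its parts, for example to group scripts by source. The helper name `parse_script_name` and the regular expression are illustrative and are not part of the repository.

```python
import re
from collections import Counter

# Assumed pattern: lowercase source prefix, three-digit number, dataset name.
SCRIPT_NAME = re.compile(r"^(?P<source>[a-z]+)_(?P<number>\d{3})_(?P<dataset>\w+)\.py$")


def parse_script_name(file_name):
    """Return (source, number, dataset) parsed from a script file name."""
    match = SCRIPT_NAME.match(file_name)
    if match is None:
        raise ValueError(f"unexpected script name: {file_name}")
    return match["source"], int(match["number"]), match["dataset"]


# A few file names copied from the table above.
names = [
    "keel_001_kdd_cup_1999.py",
    "uci_010_mushroom.py",
    "uci_058_iris.py",
    "vanderbilt_001_titanic.py",
]

# Count how many of the sampled scripts come from each source.
scripts_per_source = Counter(parse_script_name(name)[0] for name in names)
print(scripts_per_source)
```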