
APISpecGen: Generating API Specifications for Bug Detection via Specification Propagation Analysis (NDSS 2025)

Yuuoniy/APISpecGen

Intro

We introduce API Specification Propagation, the observation that API specifications propagate through hierarchical API call chains. Starting from a set of seed specifications, propagation analysis identifies related predecessor and successor APIs and generates specifications for them.

To implement this, we design APISpecGen, a framework that generates new API specifications from seed specifications. APISpecGen consists of two main modules:

  • Specification Generation: Generates new specifications based on the initial seed specifications.
  • Bug Detection: Uses the generated specifications to detect violations within the codebase, identifying potential bugs.

APISpecGen focuses on API post-handling specifications, where each specification is represented as a three-part tuple: <target API, post-operation, critical variable>, meaning that after invoking the target API, the post-operation should be applied to the critical variable.
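The tuple form can be pictured as a small record type. The following is only an illustrative sketch: the class name `PostHandlingSpec`, its field names, and the `describe` helper are hypothetical and are not taken from the APISpecGen source code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostHandlingSpec:
    """Hypothetical container for one <target API, post-operation, critical variable> tuple."""

    target_api: str      # API whose invocation creates the obligation
    post_operation: str  # operation that should follow the invocation
    critical_var: str    # value the post-operation applies to, e.g. "retval" or "arg1"

    def describe(self) -> str:
        # Spell out the meaning of the tuple in words.
        return (
            f"after calling {self.target_api}, "
            f"apply {self.post_operation} to its {self.critical_var}"
        )


# The seed specification used in the example workflow of this README.
seed = PostHandlingSpec("get_device", "put_device", "arg1")
print(seed.describe())
```

A frozen dataclass is used here so that specifications are hashable and can be collected in sets while propagating, which is a design choice of this sketch rather than a documented property of the tool.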

For more details, you may refer to the paper "Uncovering iceberg from the tip: Generating API Specifications for Bug Detection via Specification Propagation Analysis" (NDSS 2025).

Example workflow

The following figure illustrates a working example of APISpecGen. In this example, APISpecGen starts with the seed specification <get_device, put_device, arg1> and generates a new specification <nfc_get_device, nfc_put_device, retval>. This new specification is then applied to detect violations in the function nfc_genl_vendor_cmd.

Specifically, the specification generation module generates new specifications together with details such as their propagation paths. Here’s an example of a generated specification:

{
  "API": "nfc_get_device", // the inferred API
  "SecOp": "nfc_put_device", // the inferred post-operation for the API
  "usageCount": 21, // the number of usages in the program that follow the specification
  "depth": 2, // the propagation depth
  "API_path": "get_device->class_find_device->nfc_get_device", // the API's propagation chain
  "var_path": "arg->retval->retval" // the critical variable's propagation chain
}
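To make the two path fields easier to read, the sketch below splits them on `->` and pairs each API in the chain with the critical variable at the same position. It assumes that the two chains are aligned element by element, which holds for this record but is not stated as a general guarantee. The helper name `propagation_steps` is hypothetical, and the `//` comments are stripped from the record because standard JSON does not allow them.

```python
import json

# The record shown above, without the explanatory // comments.
record_text = """
{
  "API": "nfc_get_device",
  "SecOp": "nfc_put_device",
  "usageCount": 21,
  "depth": 2,
  "API_path": "get_device->class_find_device->nfc_get_device",
  "var_path": "arg->retval->retval"
}
"""


def propagation_steps(record: dict) -> list:
    """Pair each API in the chain with its critical variable (assumes aligned chains)."""
    apis = record["API_path"].split("->")
    variables = record["var_path"].split("->")
    if len(apis) != len(variables):
        raise ValueError("API_path and var_path have different lengths")
    return list(zip(apis, variables))


record = json.loads(record_text)
steps = propagation_steps(record)
for api, var in steps:
    print(f"{api}: {var}")
```

In this record the chain has three entries and a `depth` of 2, so the depth matches the number of hops from the seed API to the inferred API.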

For additional generated data, you may refer to SpecGeneration/Data/ReferenceData.

The bug detection module identifies potential bugs based on detected specification violations. Each bug report contains the buggy function and the violated specification. Here’s an example bug report:

repo_name,buggy_func,main_api,sec_op,var_type,var
kernel,nfc_genl_vendor_cmd,nfc_get_device,nfc_put_device,retval,dev
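A report in this format can be read with Python's standard `csv` module. The sketch below parses the example row and prints a one-line summary. The interpretation of the columns is inferred from the header names only, and the `summarize` helper is hypothetical rather than part of APISpecGen.

```python
import csv
import io

# The example bug report shown above.
report_text = """repo_name,buggy_func,main_api,sec_op,var_type,var
kernel,nfc_genl_vendor_cmd,nfc_get_device,nfc_put_device,retval,dev
"""


def summarize(report) -> list:
    """Turn each CSV row into a short human-readable line."""
    lines = []
    for row in csv.DictReader(report):
        lines.append(
            "{repo_name}: {buggy_func} calls {main_api} "
            "without {sec_op} on {var_type} ({var})".format(**row)
        )
    return lines


summaries = summarize(io.StringIO(report_text))
for line in summaries:
    print(line)
```

To process a real report, the `io.StringIO` object can be replaced by an open file handle.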

For more data, you may refer to BugDetection/ReferenceData.

Structure

Directory/File | Description
SpecGeneration | Specification Generation Module of APISpecGen
├─ QuickMode | Source code for specification generation
├─ utils | Utility scripts used in the source code
└─ Data | Data for generated specifications
   ├─ ReferenceData | Reference specifications generated by APISpecGen
   └─ GeneratedSpec | Specifications generated during runtime
BugDetection | Bug Detection Module of APISpecGen
├─ modules | Source code for bug detection
├─ ReferenceData | Data related to bugs and their corresponding specifications
└─ utils | Utility scripts used in the source code
APIAritifactEval | Scripts and data for evaluating API artifact utilization
ComparedWithRelatedWork | Specification data from related work
script | Scripts for automating artifact evaluation
config.cfg | Configuration file with essential project settings and paths
INSTALL.md | Installation and setup instructions
Doc_For_Aritifact_Evaluation.md | Step-by-step guide for artifact evaluation

Install

APISpecGen is implemented in Python and leverages existing code analysis tools, including Joern, Tree-sitter, and Weggli. We provide a Dockerfile to automate the setup and creation of the Docker environment. This Dockerfile includes everything needed to configure the runtime environment, install third-party tools and dependencies, set up Python libraries (as specified in requirements.txt), and download the source code for test programs (e.g., linux-5.16).

You can quickly build and start the Docker environment using the following commands:

wget https://github.com/Yuuoniy/APISpecGen/raw/refs/heads/main/Dockerfile 
docker build -t apispecgen:latest .
docker run -it --name "apispecgen" "apispecgen:latest"

You may also refer to INSTALL.md.

Step-by-Step Instructions

We provide step-by-step instructions for reproducing APISpecGen. For more information about the evaluation process, please refer to Doc_For_Aritifact_Evaluation.md.

The key experiments are as follows:

Experiment | Command | Description | Results
Minimal Running Example | - | Minimal running example for a quick test. | Reproduces the working example displayed in the paper.
├─ Quick Test for Specification Generation | ./script/0.quick_spec_generate.sh | Perform specification propagation analysis for one seed with the iteration count set to 2. | The generated specifications are saved to SpecGeneration/Data/GeneratedSpecs.
└─ Quick Test for Bug Detection | ./script/0.1.quick_bug_detection.sh | Perform a quick bug detection test using one generated specification. | The script prints the detected potential bugs.
Specification Generation | ./script/1.specification_generation.sh | Generate specifications from the six given seed specifications. | The generated specifications are saved to SpecGeneration/Data/GeneratedSpec.
Bug Detection | ./script/2.bug_detection.sh | Use the generated specifications to detect new bugs in the Linux kernel. | Bug reports are continuously logged to the file BugDetection/data/bug_report.csv.
Usability of API Artifacts | ./script/3.API_aritifact_analysis.sh | Use the generated specifications to evaluate the usability of API artifacts. | The script prints the analysis data, which reveals that API artifacts have significant limitations in specification extraction.

Extension

You can extend APISpecGen to generate specifications for new seed APIs or new projects.

For new seed APIs, you can either use tools like APHP to automatically extract seed specifications or manually create them in the tuple format <target API, post-operation, critical variable>.

To test on a new project, update the config.cfg file by adding the source code directories in the [URL] section. Use the following format:

[URL]
repo_name = {directory_of_repo_source_code}
#Example
openssl = ${Common:SOURCEDIR}/openssl
FFmpeg = ${Common:SOURCEDIR}/FFmpeg
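The `${Common:SOURCEDIR}` references follow the section-qualified syntax of `ExtendedInterpolation` in Python's standard `configparser` module. The sketch below shows how such entries resolve. It is a sketch under assumptions: the `/data/src` value is made up for illustration, and whether APISpecGen itself loads config.cfg this way is not confirmed by this README.

```python
import configparser

# Illustrative config text. The SOURCEDIR value is a hypothetical placeholder;
# use the value from your own config.cfg.
config_text = """
[Common]
SOURCEDIR = /data/src

[URL]
openssl = ${Common:SOURCEDIR}/openssl
FFmpeg = ${Common:SOURCEDIR}/FFmpeg
"""

parser = configparser.ConfigParser(
    interpolation=configparser.ExtendedInterpolation()
)
# Keep option names case-sensitive so that "FFmpeg" is not lowercased.
parser.optionxform = str
parser.read_string(config_text)

# Values are interpolated when they are read from the section.
repos = dict(parser["URL"])
for name, path in repos.items():
    print(f"{name} -> {path}")
```

Setting `optionxform` matters here because `configparser` lowercases option names by default, which would turn the repo name `FFmpeg` into `ffmpeg`.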

Use the following command to generate specifications:

python SpecGeneration/spec_generate.py --seedAPI {TargetAPI} --seedSecOp {post-operation} --critical_var {critical-variable} --max_depth {max_iterations_for_analysis} --repo_name {repo_name}
#Examples:
python SpecGeneration/spec_generate.py --seedAPI av_malloc --seedSecOp av_free --critical_var retval --max_depth 5 --repo_name FFmpeg
python SpecGeneration/spec_generate.py --seedAPI BIO_new --seedSecOp BIO_free --critical_var retval --max_depth 5 --repo_name openssl
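When several seeds need to be processed, the command lines can be assembled programmatically. The following sketch only builds and prints the commands, using the flags shown above; the `build_command` helper is hypothetical, and actually running the commands (for example with `subprocess.run`) is left to the reader.

```python
import shlex


def build_command(seed_api, seed_sec_op, critical_var, max_depth, repo_name):
    """Assemble the spec_generate.py invocation as an argument list."""
    return [
        "python", "SpecGeneration/spec_generate.py",
        "--seedAPI", seed_api,
        "--seedSecOp", seed_sec_op,
        "--critical_var", critical_var,
        "--max_depth", str(max_depth),
        "--repo_name", repo_name,
    ]


# The two example seeds from this README.
seeds = [
    ("av_malloc", "av_free", "retval", 5, "FFmpeg"),
    ("BIO_new", "BIO_free", "retval", 5, "openssl"),
]

commands = [build_command(*seed) for seed in seeds]
for command in commands:
    print(shlex.join(command))
```

Building an argument list instead of a single string avoids shell quoting problems if an API name or repo name ever contains unusual characters.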

For reference, our generated specifications are saved in the directory SpecGeneration/Data/ReferenceData/OtherPrograms.
