<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title>GSoC-2021-ShengFengLu</title>
<link href="/2021/08/17/GSoC-2021-ShengFengLu/"/>
<url>/2021/08/17/GSoC-2021-ShengFengLu/</url>
<content type="html"><![CDATA[<h1 id="Replace-the-core-library-of-Quark-Engine"><a href="#Replace-the-core-library-of-Quark-Engine" class="headerlink" title="Replace the core library of Quark-Engine"></a>Replace the core library of Quark-Engine</h1><h2 id="Summary"><a href="#Summary" class="headerlink" title="Summary"></a>Summary</h2><p>My name is Sheng-Feng Lu. I am a fourth-year CS undergraduate student at <a href="https://www.nsysu.edu.tw/">NSYSU</a>, Taiwan. This summer, I participated in a <a href="https://summerofcode.withgoogle.com/">Google Summer of Code</a> project under the <a href="https://www.honeynet.org/">Honeynet Project</a> and I contributed to <a href="https://github.com/quark-engine/quark-engine">Quark-Engine</a>, an Android malware storyteller. The two main goals of my project are <strong>1. To rewrite the core library of Quark with <a href="https://github.com/rizinorg/rizin">Rizin</a></strong> and <strong>2. Evaluate and improve the performance of Quark Engine</strong>. </p><p>The following section is the summary of <span style="color:#FF3377"><strong>9</strong></span> important works I’ve done and their impacts. Also, works like <a href="#3-Implement-an-abstract-class-for-Quark-core-libraries">#3</a>, <a href="#4-Implement-Quark%E2%80%99s-new-core-library-with-Rizin">#4</a>, <a href="#5-Implement-more-bytecode-instructions-in-Quark%E2%80%99s-Dalvik-bytecode-loader">#5</a>, <a href="#6-Add-feature-Inheritance-class-check">#6</a>, and <a href="#8-Quark-Performance-Improvement">#8</a> need to be highlighted since they reached huge milestones for Quark Engine. </p><h3 id="1-Add-new-test-cases-and-improve-the-existing-ones-for-Quark-Engine"><a href="#1-Add-new-test-cases-and-improve-the-existing-ones-for-Quark-Engine" class="headerlink" title="1. Add new test cases and improve the existing ones for Quark-Engine"></a><strong>1. 
Add new test cases and improve the existing ones for Quark-Engine</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> <strong>96</strong> test cases were added and <strong>19</strong> test cases were improved.<br>– <span style="color:#FF3377"><em>Impact:</em></span> Good tests protect the correctness of the program, so existing functionality is less likely to break during refactoring or modification.<br>– <span style="color:#FF3377"><em>Related Issue:</em></span> <strong><a href="https://github.com/quark-engine/gsoc2021-ShengFengLu/issues/1">Issue #1</a></strong><br>– <span style="color:#FF3377"><em>Related PRs:</em></span> <strong><a href="https://github.com/quark-engine/quark-engine/pull/189">PR #189</a></strong>, <strong><a href="https://github.com/quark-engine/quark-engine/pull/182">PR #182</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#1-Detail-Add-new-test-cases-and-improve-the-existing-ones-for-Quark-Engine">Go to details on this page</a>. </p><h3 id="2-Implement-a-more-comprehensive-call-graph-Behavior-Map"><a href="#2-Implement-a-more-comprehensive-call-graph-Behavior-Map" class="headerlink" title="2. Implement a more comprehensive call graph (Behavior Map)"></a><strong>2. 
Implement a more comprehensive call graph (Behavior Map)</strong></h3><p>– <span style="color:#FF3377"><em>Work #1:</em></span> Behavior Map is implemented.<br>– <span style="color:#FF3377"><em>Work #2:</em></span> Rule classification report is enhanced with information such as relationships between behaviors.<br>– <span style="color:#FF3377"><em>Impact:</em></span> By reading the behavior map, users can have an overview of the relationships between suspicious behaviors in the targeted APK file.<br>– <span style="color:#FF3377"><em>Related Issue:</em></span> <strong><a href="https://github.com/quark-engine/quark-engine/issues/191">Issue #191</a></strong><br>– <span style="color:#FF3377"><em>Related PR:</em></span> <strong><a href="https://github.com/quark-engine/quark-engine/pull/192">PR #192</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#2-Detail-Implement-a-more-comprehensive-call-graph-Behavior-Map">Go to details on this page</a>. </p><h3 id="3-Implement-an-abstract-class-for-Quark-core-libraries"><a href="#3-Implement-an-abstract-class-for-Quark-core-libraries" class="headerlink" title="3. Implement an abstract class for Quark core libraries"></a><strong>3. 
Implement an abstract class for Quark core libraries</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> Implement an <a href="https://github.com/quark-engine/quark-engine/blob/master/quark/core/interface/baseapkinfo.py">abstract class</a> for Quark core libraries such as <a href="https://github.com/quark-engine/quark-engine/blob/master/quark/core/apkinfo.py">apkinfo.py</a> and <a href="https://github.com/quark-engine/quark-engine/blob/master/quark/core/rzapkinfo.py">rzapkinfo.py</a>.<br>– <span style="color:#FF3377"><em>Impact:</em></span> This helps to define essential methods of the Quark core library and reduces a lot of repetitive work in developing new core libraries.<br>– <span style="color:#FF3377"><em>Related PRs:</em></span> <strong><a href="https://github.com/quark-engine/quark-engine/pull/194">PR #194</a></strong>, <strong><a href="https://github.com/quark-engine/quark-engine/pull/197">PR #197</a></strong>, <strong><a href="https://github.com/quark-engine/quark-engine/pull/203">PR #203</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#3-Detail-Implement-an-abstract-class-for-Quark-core-libraries">Go to details on this page</a>. </p><h3 id="4-Implement-Quark’s-new-core-library-with-Rizin"><a href="#4-Implement-Quark’s-new-core-library-with-Rizin" class="headerlink" title="4. Implement Quark’s new core library with Rizin"></a><strong>4. Implement Quark’s new core library with Rizin</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> Implement Quark’s new core library (<a href="https://github.com/quark-engine/quark-engine/blob/master/quark/core/rzapkinfo.py">rzapkinfo.py</a>) with Rizin.<br>– <span style="color:#FF3377"><em>Impact:</em></span> <a href="https://github.com/androguard/androguard">Androguard</a> is one of the libraries used by Quark to analyze APKs, and it is rarely maintained now. Therefore, we decided to replace Androguard with Rizin. 
Rizin is a more active binary analysis framework that can help Quark maintain stability.<br>– <span style="color:#FF3377"><em>Related Issue:</em></span> <strong><a href="https://github.com/rizinorg/rizin/issues/1276">Issue #1276</a></strong><br>– <span style="color:#FF3377"><em>Related PRs:</em></span> <strong><a href="https://github.com/rizinorg/rizin/pull/1303">PR #1303</a>, <a href="https://github.com/quark-engine/quark-engine/pull/205">PR #205</a></strong>, <strong><a href="https://github.com/quark-engine/quark-engine/pull/209">PR #209</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#4-Detail-Implement-Quark%E2%80%99s-new-core-library-with-Rizin">Go to details on this page</a>. </p><h3 id="5-Implement-more-bytecode-instructions-in-Quark’s-Dalvik-bytecode-loader"><a href="#5-Implement-more-bytecode-instructions-in-Quark’s-Dalvik-bytecode-loader" class="headerlink" title="5. Implement more bytecode instructions in Quark’s Dalvik bytecode loader"></a><strong>5. Implement more bytecode instructions in Quark’s Dalvik bytecode loader</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> <strong>227</strong> bytecode instructions are newly implemented! Only <strong>19</strong> bytecode instructions were implemented before this project.<br>– <span style="color:#FF3377"><em>Impact:</em></span> More behaviors are detected since we implemented almost all bytecode instructions.<br>– <span style="color:#FF3377"><em>Related Issue:</em></span> <strong><a href="https://github.com/quark-engine/quark-engine/issues/131">Issue #131</a></strong><br>– <span style="color:#FF3377"><em>Related PR:</em></span> <strong><a href="https://github.com/quark-engine/quark-engine/pull/207">PR #207</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#5-Detail-Implement-more-bytecode-instructions-in-Quark%E2%80%99s-Dalvik-bytecode-loader">Go to details on this page</a>. 
</p><h3 id="6-Add-feature-Inheritance-class-check"><a href="#6-Add-feature-Inheritance-class-check" class="headerlink" title="6. Add feature: Inheritance class check"></a><strong>6. Add feature: Inheritance class check</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> We implemented a feature that checks whether a class inherits from another class.<br>– <span style="color:#FF3377"><em>Impact:</em></span> With this feature, Quark can detect the use of inherited methods.<br>– <span style="color:#FF3377"><em>Related PR:</em></span> <strong><a href="https://github.com/quark-engine/quark-engine/pull/210">PR #210</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#6-Detail-Add-feature-Inheritance-class-check">Go to details on this page</a>. </p><h3 id="7-Add-feature-API-parameters-check"><a href="#7-Add-feature-API-parameters-check" class="headerlink" title="7. Add feature: API parameters check"></a><strong>7. Add feature: API parameters check</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> We implemented a feature that checks the parameters passed to Android APIs.<br>– <span style="color:#FF3377"><em>Impact:</em></span> With this feature, Quark can now determine whether resources such as SMS, call logs, or other sensitive information are targeted.<br>– <span style="color:#FF3377"><em>Related PR:</em></span> <strong><a href="https://github.com/quark-engine/quark-engine/pull/212">PR #212</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#7-Detail-Add-feature-API-parameters-check">Go to details on this page</a>. </p><h3 id="8-Quark-Performance-Improvement"><a href="#8-Quark-Performance-Improvement" class="headerlink" title="8. Quark Performance Improvement"></a><strong>8. 
Quark Performance Improvement</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> Tools like <a href="https://www.brendangregg.com/flamegraphs.html">Flame Graph</a>, <a href="https://github.com/P403n1x87/austin">Austin</a>, <a href="https://github.com/sumerc/yappi">Yappi</a>, and <a href="https://github.com/zhuyifei1999/guppy3">Guppy3</a> were used to evaluate the performance of Quark. Two performance bottlenecks were found.<br>– <span style="color:#FF3377"><em>Impact:</em></span> With these two bottlenecks resolved, the Rizin core library uses <strong>59.20%</strong> less CPU time and <strong>22.27%</strong> less memory on average, and the Androguard core library uses <strong>23.95%</strong> less CPU time on average.<br>– <span style="color:#FF3377"><em>Related PRs:</em></span> <strong><a href="https://github.com/quark-engine/quark-engine/pull/231">PR #231</a></strong>, <strong><a href="https://github.com/quark-engine/quark-engine/pull/232">PR #232</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#8-Detail-Quark-Performance-Improvement">Go to details on this page</a>. </p><h3 id="9-Implement-parallelized-analysis"><a href="#9-Implement-parallelized-analysis" class="headerlink" title="9. Implement parallelized analysis"></a><strong>9. Implement parallelized analysis</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> We parallelized Quark’s analysis.<br>– <span style="color:#FF3377"><em>Impact:</em></span> With this technique, the analysis of large APKs is faster. 
This improvement is suitable when users encounter speed issues with Quark.<br>– <span style="color:#FF3377"><em>Related PR:</em></span> <strong><a href="https://github.com/quark-engine/quark-engine/pull/223">PR #223</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#9-Detail-Implement-parallelized-analysis">Go to details on this page</a>. </p><hr><h2 id="Details"><a href="#Details" class="headerlink" title="Details"></a>Details</h2><h3 id="1-Detail-Add-new-test-cases-and-improve-the-existing-ones-for-Quark-Engine"><a href="#1-Detail-Add-new-test-cases-and-improve-the-existing-ones-for-Quark-Engine" class="headerlink" title="1. [Detail] Add new test cases and improve the existing ones for Quark-Engine"></a>1. [Detail] Add new test cases and improve the existing ones for Quark-Engine</h3><p>To improve the existing test cases and to develop the new Rizin-based core library in a test-driven manner, I added test cases based on two guidelines, <strong>1. Boundary value analysis</strong> and <strong>2. Equivalence class partitioning</strong>, which keep the test suite efficient while making it more comprehensive.</p><p>A total of <strong>96</strong> test cases have been added and <strong>19</strong> existing test cases have been improved. 
Quark’s overall test coverage is now <strong>79%</strong> (an increase of <strong>10%</strong>).</p><p><span style="color:#157040"><em><strong>Related Issue:</strong></em></span> </p><p>Gsoc2021-ShengFengLu <a href="https://github.com/quark-engine/gsoc2021-ShengFengLu/issues/1">Issue #1</a>: Enrich Quark tests </p><p><span style="color:#157040"><em><strong>Related PRs:</strong></em></span> </p><p>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/189">PR #189</a>: Refactor/enrich the rest of Quark’s tests<br>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/182">PR #182</a>: Refactor/enrich the tests of the analysis part of modules </p><p><a href="#1-Add-new-test-cases-and-improve-the-existing-ones-for-Quark-Engine">Go back to summary</a> </p><hr><h3 id="2-Detail-Implement-a-more-comprehensive-call-graph-Behavior-Map"><a href="#2-Detail-Implement-a-more-comprehensive-call-graph-Behavior-Map" class="headerlink" title="2. [Detail] Implement a more comprehensive call graph (Behavior Map)"></a>2. [Detail] Implement a more comprehensive call graph (Behavior Map)</h3><p>We are committed to providing users with more insightful information. Therefore, we developed a behavior map that gives users an overview of the relationships between suspicious behaviors of different functions in the target APK. These relationships also appear in the rule classification report. See Figure 2.1 and Figure 2.2 for more details.</p><p><img src="https://i.imgur.com/r9EwLun.png"> </p><center>Figure 2.1. Behavior Map</center><br> <p><img src="https://i.imgur.com/rClsdbS.png"></p><center>Figure 2.2. 
Rule classification report </center><br> <p><span style="color:#157040"><em><strong>Related Issue:</strong></em></span> </p><p>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/issues/191">Issue #191</a>: Show Parent Functions’ Cross-References in Rule Classification </p><p><span style="color:#157040"><em><strong>Related PR:</strong></em></span> </p><p>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/192">PR #192</a>: Show Parent Functions’ Cross-References In Rule Classification </p><p><a href="#2-Implement-a-more-comprehensive-call-graph-Behavior-Map">Go back to summary</a> </p><hr><h3 id="3-Detail-Implement-an-abstract-class-for-Quark-core-libraries"><a href="#3-Detail-Implement-an-abstract-class-for-Quark-core-libraries" class="headerlink" title="3. [Detail] Implement an abstract class for Quark core libraries"></a>3. [Detail] Implement an abstract class for Quark core libraries</h3><p>To let the Rizin-based core library coexist with the Androguard-based core library and to ensure the resilience of Quark, I implemented an abstract class for Quark core libraries. Figure 3.1 gives a glimpse of the source code of the abstract class. The complete source code is available <a href="https://github.com/quark-engine/quark-engine/blob/master/quark/core/interface/baseapkinfo.py">here</a>. </p><p><img src="https://i.imgur.com/OpM0Fmc.png"> </p><center>Figure 3.1. 
Source code of the abstract class</center><br> <p><span style="color:#157040"><em><strong>Related PRs:</strong></em></span> </p><p>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/194">PR #194</a>: Add A Standard Interface for Accessing APK Information<br>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/197">PR #197</a>: Clean Up the Direct Use of MethodAnalysis<br>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/203">PR #203</a>: Make Apkinfo implement the standard APK interface </p><p><a href="#3-Implement-an-abstract-class-for-Quark-core-libraries">Go back to summary</a> </p><hr><h3 id="4-Detail-Implement-Quark’s-new-core-library-with-Rizin"><a href="#4-Detail-Implement-Quark’s-new-core-library-with-Rizin" class="headerlink" title="4. [Detail] Implement Quark’s new core library with Rizin"></a>4. [Detail] Implement Quark’s new core library with Rizin</h3><p>Currently, many features in Quark-Engine rely on <a href="https://github.com/androguard/androguard">Androguard</a> for analysis. However, Androguard has been inactive for the past two years. Therefore, to preserve Quark’s resilience, we chose <a href="https://github.com/rizinorg/rizin">Rizin</a>, a more active open source reverse engineering framework, as the basis of Quark’s new core library.</p><p>Two issues were encountered while implementing the Rizin core library: <strong>1. The result of Xref is incorrect in some cases</strong> and <strong>2. It doesn’t support multiple dex file analysis.</strong></p><p>We reported these two issues to the Rizin community. For the <a href="https://github.com/rizinorg/rizin/issues/1276">Xref issue</a>, the community confirmed that it is a bug in Rizin itself and scheduled it to be resolved. As for the multi-dex analysis issue, the community has already <a href="https://github.com/rizinorg/rizin/pull/1303">fixed this problem</a>. 
</p><p>Here is a <a href="https://i.imgur.com/Uc2dJPx.mp4">video</a> demonstrating a Quark analysis with the Rizin-based library. </p><p><span style="color:#157040"><em><strong>Related Issue:</strong></em></span> </p><p>Rizin <a href="https://github.com/rizinorg/rizin/issues/1276">Issue #1276</a>: Missing method references in DEX </p><p><span style="color:#157040"><em><strong>Related PRs:</strong></em></span> </p><p>Rizin <a href="https://github.com/rizinorg/rizin/pull/1303">PR #1303</a>: Load all dalvik files in an APK<br>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/205">PR #205</a>: Add a Rizin-based core library<br>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/209">PR #209</a>: Update travis.yml to set up Rizin </p><p><a href="#4-Implement-Quark%E2%80%99s-new-core-library-with-Rizin">Go back to summary</a> </p><hr><h3 id="5-Detail-Implement-more-bytecode-instructions-in-Quark’s-Dalvik-bytecode-loader"><a href="#5-Detail-Implement-more-bytecode-instructions-in-Quark’s-Dalvik-bytecode-loader" class="headerlink" title="5. [Detail] Implement more bytecode instructions in Quark’s Dalvik bytecode loader"></a>5. [Detail] Implement more bytecode instructions in Quark’s Dalvik bytecode loader</h3><p>In the previous version of Quark, the Dalvik bytecode loader supported only two types of bytecodes. The missing bytecodes may cause false positives in taint analysis.</p><p>Therefore, one of my goals was to improve accuracy by adding the missing <strong>227</strong> bytecodes to the loader. All added bytecodes are related to the data flow between the registers. The Dalvik bytecode loader now supports bytecode types such as array/exception accessing, function calls, and arithmetic operations.</p><p>As a result, the analysis accuracy improved. 
We used <a href="https://github.com/quark-engine/apk-malware-samples/blob/master/14d9f1a92dd984d6040cc41ed06e273e.apk">14d9f1a92dd984d6040cc41ed06e273e.apk</a> to test this implementation. The experiment shows that the confidence of the detected behavior increased by <strong>20%</strong>. See Figure 5.1 and Figure 5.2 for more details. </p><p><img src="https://i.imgur.com/OVSPS1A.png"></p><center>Figure 5.1. 19 bytecode instructions implemented</center><br> <p><img src="https://i.imgur.com/PZUMOn1.png"></p><center>Figure 5.2. 227 bytecode instructions implemented</center><br> <p><span style="color:#157040"><em><strong>Related Issue:</strong></em></span> </p><p>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/issues/131">Issue #131</a>: Improve Quark’s tainted analysis accuracy in stage 5 </p><p><span style="color:#157040"><em><strong>Related PR:</strong></em></span> </p><p>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/207">PR #207</a>: Improve the tainted analysis of Quark </p><p><a href="#5-Implement-more-bytecode-instructions-in-Quark%E2%80%99s-Dalvik-bytecode-loader">Go back to summary</a> </p><hr><h3 id="6-Detail-Add-feature-Inheritance-class-check"><a href="#6-Detail-Add-feature-Inheritance-class-check" class="headerlink" title="6. [Detail] Add feature: Inheritance class check"></a>6. [Detail] Add feature: Inheritance class check</h3><p>Previous versions of Quark did not support class inheritance checking, although it is essential for detecting Android API usage.</p><p>Therefore, I implemented a patch to deal with this problem. Three things were implemented in this <a href="https://github.com/quark-engine/quark-engine/pull/210">patch</a>: </p><p>1. Track the data type of each register in the bytecode loader.<br>2. Construct a class inheritance relation dictionary in <a href="https://github.com/quark-engine/quark-engine/blob/master/quark/evaluator/pyeval.py">pyeval.py</a>.<br>3. 
Add a lookup procedure to the loader that checks the inheritance relationship. </p><p>As a result, the analysis accuracy improved. We used <a href="https://github.com/quark-engine/apk-malware-samples/blob/master/14d9f1a92dd984d6040cc41ed06e273e.apk">14d9f1a92dd984d6040cc41ed06e273e.apk</a> to test this implementation. The experiment shows that the confidence of the detected behavior increased by <strong>20%</strong>. See Figure 6.1 and Figure 6.2 for more details. </p><p><img src="https://i.imgur.com/zLvKhWR.png"></p><center>Figure 6.1. No inheritance class check </center><br> <p><img src="https://i.imgur.com/jMOAImJ.png"> </p><center>Figure 6.2. With inheritance class check</center><br> <p><span style="color:#157040"><em><strong>Related PR:</strong></em></span> </p><p>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/210">PR #210</a>: Enhance the Simulation Accuracy of the Bytecode Loader </p><p><a href="#6-Add-feature-Inheritance-class-check">Go back to summary</a> </p><hr><h3 id="7-Detail-Add-feature-API-parameters-check"><a href="#7-Detail-Add-feature-API-parameters-check" class="headerlink" title="7. [Detail] Add feature: API parameters check"></a>7. [Detail] Add feature: API parameters check</h3><p>Many Android APIs behave differently depending on their parameters. However, previous versions of Quark could not check Android API parameters, so analysts always had to check them manually when analyzing malware.</p><p>With this feature, Quark can now determine whether the malware gathers personal information such as SMS, call logs, or any other sensitive information.</p><p>This implementation lets users fill in keywords, such as the targeted resources (SMS, Calllog), in the detection rule.</p><p>As a result, the analysis accuracy improved. 
We used <a href="https://github.com/quark-engine/apk-malware-samples/blob/master/14d9f1a92dd984d6040cc41ed06e273e.apk">14d9f1a92dd984d6040cc41ed06e273e.apk</a> to test this implementation. The experiment shows that the API parameters check works with the new rule. See Figure 7.1 and Figure 7.2 for more details. </p><p><img src="https://i.imgur.com/a4kkmph.png"></p><center>Figure 7.1. New detection rule</center><br> <p><img src="https://i.imgur.com/o7XgtTg.png"></p><center>Figure 7.2. Summary report with API parameters check</center><br> <p><span style="color:#157040"><em><strong>Related PR:</strong></em></span> </p><p>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/212">PR #212</a>: Add an Optional Parameter Filter For JSON Rules </p><p><a href="#7-Add-feature-API-parameters-check">Go back to summary</a> </p><hr><h3 id="8-Detail-Quark-Performance-Improvement"><a href="#8-Detail-Quark-Performance-Improvement" class="headerlink" title="8. [Detail] Quark Performance Improvement"></a>8. [Detail] Quark Performance Improvement</h3><p>According to user feedback, Quark encounters performance problems when analyzing large APKs. Therefore, we chose several performance evaluation tools, namely Flame Graph, Austin, Yappi, and Guppy3, to find the performance problems and solve them.</p><p>Our evaluation is quite simple: we take <strong>CPU time</strong> and <strong>memory usage</strong> as the indicators to evaluate the performance of Quark’s <strong>two core libraries</strong> (apkinfo.py, rzapkinfo.py). <strong>3</strong> sample APKs of different file sizes were chosen to conduct the evaluation.</p><p><span style="color:#157040"><em><strong>8.1. Core Library based on Androguard</strong></em></span> </p><p><strong>CPU time - Androguard based core library</strong> </p><p>In this section, we use 3 sample APKs with sizes of 1 MB, 5.3 MB, and 10 MB to evaluate the CPU time of the Androguard-based core library. 
All 3 APKs come from <a href="https://androzoo.uni.lu/">AndroZoo</a>, the database of the University of Luxembourg, and are flagged as malware by <a href="https://www.virustotal.com/">VirusTotal</a>. We used Flame Graph to generate the 3 figures below. All figures point to one problem: <code>find_method()</code> in <a href="https://github.com/quark-engine/quark-engine/blob/master/quark/core/apkinfo.py">apkinfo.py</a> takes up most of the CPU time. See Figure 8.1, Figure 8.2, and Figure 8.3 for more details. </p><p><img src="https://i.imgur.com/Ju4v6nK.png"></p><center>Figure 8.1. CPU Time: APK size 1 MB</center><br> <p><img src="https://i.imgur.com/NUhyFZG.png"></p><center>Figure 8.2. CPU Time: APK size 5.3 MB</center><br> <p><img src="https://i.imgur.com/dWafXc0.png"></p><center>Figure 8.3. CPU Time: APK size 10 MB</center><br> <p><strong>Memory Usage - Androguard based core library</strong> </p><p>In this section, the same sample APKs were used to evaluate the memory usage of the Androguard-based core library. We used Flame Graph to generate the 3 figures below. All figures point to one problem: most of the memory is consumed when parsing APKs with Androguard. See Figure 8.4, Figure 8.5, and Figure 8.6 for more details. </p><p><img src="https://i.imgur.com/1mfQB5d.png"></p><center>Figure 8.4. Memory Usage: APK size 1 MB</center><br> <p><img src="https://i.imgur.com/hDAycJC.png"></p><center>Figure 8.5. Memory Usage: APK size 5.3 MB</center><br> <p><img src="https://i.imgur.com/Iq7Ldp9.png"></p><center>Figure 8.6. Memory Usage: APK size 10 MB</center><br> <p><span style="color:#157040"><em><strong>8.2. Core Library based on Rizin</strong></em></span> </p><p><strong>CPU time - Rizin based core library</strong> </p><p>In this section, the same sample APKs were used to evaluate the CPU time of the Rizin-based core library. We used Flame Graph to generate the 3 figures below. All figures point to one problem: repeated calculation of unchanging results. 
We found that 3 methods have this problem: <code>all_methods()</code>, <code>permission()</code>, and <code>class_hierarchy()</code> in <a href="https://github.com/quark-engine/quark-engine/blob/master/quark/core/rzapkinfo.py">rzapkinfo.py</a>. See Figure 8.7, Figure 8.8, and Figure 8.9 for more details. </p><p><img src="https://i.imgur.com/koZyNFX.png"></p><center>Figure 8.7. CPU Time: APK size 1 MB</center><br> <p><img src="https://i.imgur.com/Ulpl8B6.png"></p><center>Figure 8.8. CPU Time: APK size 5.3 MB</center><br> <p><img src="https://i.imgur.com/g6sHi9J.png"></p><center>Figure 8.9. CPU Time: APK size 10 MB</center><br> <p><strong>Memory Usage - Rizin based core library</strong> </p><p>In this section, the same 3 sample APKs were used to evaluate the memory usage of the Rizin-based core library. We used Flame Graph to generate the 3 figures below. All figures point to one problem: duplicate Rizin processes. We found that 1 method has this problem, <code>permission()</code> in rzapkinfo.py. Note that <code>permission()</code> is not visible in the figures because most of the objects are created in the third-party method <code>cmdj()</code>.</p><p><img src="https://i.imgur.com/K2JPkvD.png"></p><center>Figure 8.10. Memory Usage: APK size 1 MB</center><br> <p><img src="https://i.imgur.com/sdDH2zE.png"></p><center>Figure 8.11. Memory Usage: APK size 5.3 MB</center><br> <p><img src="https://i.imgur.com/0HDcd6u.png"></p><center>Figure 8.12. Memory Usage: APK size 10 MB</center><br> <p><span style="color:#157040"><em><strong>8.3. Solutions to Problems found</strong></em></span> </p><p>Androguard based core library </p><ul><li>CPU Time: We implemented a <a href="https://github.com/quark-engine/quark-engine/pull/232">patch</a> to <a href="https://github.com/quark-engine/quark-engine/blob/master/quark/core/apkinfo.py">apkinfo.py</a> to fix this problem. </li><li>Memory Usage: The solution, for now, is to use the Rizin core library. 
We plan to help Androguard solve it in the future.</li></ul><p>Rizin based core library </p><ul><li>CPU Time: We implemented a <a href="https://github.com/quark-engine/quark-engine/pull/231">patch</a> to <a href="https://github.com/quark-engine/quark-engine/blob/master/quark/core/rzapkinfo.py">rzapkinfo.py</a> to fix this problem. </li><li>Memory Usage: This problem is solved by the same <a href="https://github.com/quark-engine/quark-engine/pull/231">patch</a> we submitted. </li></ul><p><span style="color:#157040"><em><strong>8.4. Performance Re-assessments</strong></em></span> </p><p><strong>Androguard based core library</strong><br>After applying the <a href="https://github.com/quark-engine/quark-engine/pull/232">patch</a>, CPU time was reduced by <strong>23.95%</strong> on average. </p><table><thead><tr><th align="left">MD5</th><th align="right">Size<br>(MB)</th><th align="right">Before the improvement<br>(seconds)</th><th align="right">After the improvement<br>(seconds)</th><th align="right">Reduction<br>(%)</th></tr></thead><tbody><tr><td align="left">9283c74dd8356c18bb6d94b88b8fdd9b</td><td align="right">1</td><td align="right">2.00</td><td align="right">1.58</td><td align="right">26%</td></tr><tr><td align="left">8e241d9a7a7cf8bf606b15a9f43af5fd</td><td align="right">5.3</td><td align="right">3.47</td><td align="right">2.91</td><td align="right">16.13%</td></tr><tr><td align="left">3edfc78ab53521942798ad551027d04f</td><td align="right">10</td><td align="right">60.53</td><td align="right">42.53</td><td align="right">29.73%</td></tr></tbody></table><center>Table 8.1. CPU Time on all APKs (Androguard based core library)</center><br> <p><strong>Rizin based core library</strong><br>After applying the <a href="https://github.com/quark-engine/quark-engine/pull/231">patch</a>, CPU time was reduced by <strong>59.20%</strong> on average. 
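</p><p>The caching idea behind this patch can be illustrated with a minimal sketch. It assumes a hypothetical class, <code>CachedApkInfo</code>, with a made-up <code>_query_backend()</code> helper standing in for an expensive Rizin query. Neither name comes from rzapkinfo.py, and the actual patch may be structured differently.</p>

```python
from functools import cached_property


class CachedApkInfo:
    """Hypothetical stand-in for a core library class; not Quark's real code."""

    def __init__(self, method_names):
        self._method_names = list(method_names)
        self.query_count = 0  # counts how often the expensive step runs

    def _query_backend(self):
        # Stands in for an expensive query to the analysis backend.
        self.query_count += 1
        return frozenset(self._method_names)

    @cached_property
    def all_methods(self):
        # Computed on first access, then stored on the instance and reused.
        return self._query_backend()


apk = CachedApkInfo(["Lcom/example/A;->foo()V", "Lcom/example/A;->bar()V"])
first = apk.all_methods
second = apk.all_methods
print(apk.query_count)
```

<p>Repeated accesses reuse the stored result, so the expensive step runs once per instance instead of once per call.</p><p>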
</p><table><thead><tr><th align="left">MD5</th><th align="right">Size<br>(MB)</th><th align="right">Before the improvement<br>(seconds)</th><th align="right">After the improvement<br>(seconds)</th><th align="right">Reduction<br>(%)</th></tr></thead><tbody><tr><td align="left">9283c74dd8356c18bb6d94b88b8fdd9b</td><td align="right">1</td><td align="right">5.56</td><td align="right">1.24</td><td align="right">77.70%</td></tr><tr><td align="left">8e241d9a7a7cf8bf606b15a9f43af5fd</td><td align="right">5.3</td><td align="right">21.47</td><td align="right">12.60</td><td align="right">41.31%</td></tr><tr><td align="left">3edfc78ab53521942798ad551027d04f</td><td align="right">10</td><td align="right">1118.00</td><td align="right">462.90</td><td align="right">58.60%</td></tr></tbody></table><center>Table 8.2. CPU Time on all APKs (Rizin based core library)</center><br> <p>After applying the <a href="https://github.com/quark-engine/quark-engine/pull/231">patch</a>, memory usage was reduced by <strong>22.27%</strong> on average. </p><table><thead><tr><th>MD5</th><th align="right">Size (MB)</th><th align="right">Before the improvement (MB)</th><th align="right">After the improvement (MB)</th><th align="right">Reduction (%)</th></tr></thead><tbody><tr><td>9283c74dd8356c18bb6d94b88b8fdd9b</td><td align="right">1</td><td align="right">285.00</td><td align="right">211.76</td><td align="right">25.70%</td></tr><tr><td>8e241d9a7a7cf8bf606b15a9f43af5fd</td><td align="right">5.3</td><td align="right">278.19</td><td align="right">235.63</td><td align="right">15.30%</td></tr><tr><td>3edfc78ab53521942798ad551027d04f</td><td align="right">10</td><td align="right">1116.40</td><td align="right">828.11</td><td align="right">25.82%</td></tr></tbody></table><center>Table 8.3. 
Memory Usage on all APKs</center><br> <p><span style="color:#157040"><em><strong>Related PRs:</strong></em></span> </p><p>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/231">PR #231</a>: Cache commonly used attributes in the Rizin core library<br>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/232">PR #232</a>: Simplify the generator value check in the Androguard core library</p><p><a href="#8-Quark-Performance-Improvement">Go back to summary</a> </p><h3 id="9-Detail-Implement-parallelized-analysis"><a href="#9-Detail-Implement-parallelized-analysis" class="headerlink" title="9. [Detail] Implement parallelized analysis"></a>9. [Detail] Implement parallelized analysis</h3><p><strong>Androguard based core library</strong> </p><p>The improvement is significant on large APKs. However, the cost of process initialization offsets the gain on small ones and can even make the analysis slower. I conclude that parallelized analysis is best applied when users encounter speed issues with Quark. See Table 9.1 for more details. </p><table><thead><tr><th>MD5</th><th align="right">Size<br>(MB)</th><th align="right">Before The Improvement<br/> (Seconds)</th><th align="right">After The Improvement<br>(Seconds)</th><th align="right">Percentage saved<br>(%)</th></tr></thead><tbody><tr><td>9283c74dd8356c18bb6d94b88b8fdd9b</td><td align="right">1</td><td align="right">2.90</td><td align="right">2.04</td><td align="right">29.65%</td></tr><tr><td>8e241d9a7a7cf8bf606b15a9f43af5fd</td><td align="right">5.3</td><td align="right">3.04</td><td align="right">4.10</td><td align="right">-34.87%</td></tr><tr><td>3edfc78ab53521942798ad551027d04f</td><td align="right">10</td><td align="right">69.32</td><td align="right">42.30</td><td align="right">38.98%</td></tr></tbody></table><center>Table 9.1. Androguard analysis time on all APKs</center><br> <p><strong>Rizin based core library</strong> </p><p>The improvement is significant on large APKs. 
However, the cost of process initialization offsets the gain on small ones and can even make the analysis slower. I conclude that parallelized analysis is best applied when users encounter speed issues with Quark. See Table 9.2 for more details. </p><table><thead><tr><th>MD5</th><th align="right">Size (MB)</th><th align="right">Before The Improvement (Seconds)</th><th align="right">After The Improvement (Seconds)</th><th align="right">Percentage saved (%)</th></tr></thead><tbody><tr><td>9283c74dd8356c18bb6d94b88b8fdd9b</td><td align="right">1</td><td align="right">1.24</td><td align="right">1.70</td><td align="right">-37.09%</td></tr><tr><td>8e241d9a7a7cf8bf606b15a9f43af5fd</td><td align="right">5.3</td><td align="right">21.47</td><td align="right">12.60</td><td align="right">41.31%</td></tr><tr><td>3edfc78ab53521942798ad551027d04f</td><td align="right">10</td><td align="right">462.90</td><td align="right">36.93</td><td align="right">92.02%</td></tr></tbody></table><center>Table 9.2. Rizin analysis time on all APKs</center><br> <p><span style="color:#157040"><em><strong>Related PRs:</strong></em></span> </p><p>Quark-Engine <a href="https://github.com/quark-engine/quark-engine/pull/223">PR #223</a>: Parallelize the analysis</p><p><a href="#9-Implement-parallelized-analysis">Go back to summary</a></p><hr><h2 id="What’s-Next"><a href="#What’s-Next" class="headerlink" title="What’s Next"></a>What’s Next</h2><ol><li>Keep assisting the Rizin community with the <a href="https://github.com/rizinorg/rizin/issues/1276">Xref issue</a> in APKs.</li><li>Address the Androguard memory problem we found and help Androguard solve it.</li><li>Keep improving the Rizin core library until it outperforms the Androguard core library. 
</li></ol><h2 id="Thanks"><a href="#Thanks" class="headerlink" title="Thanks"></a>Thanks</h2><p>Thanks to my mentors, <a href="https://github.com/krnick">JunWei Song</a> and <a href="https://github.com/18z">KunYu Chen</a>, for their sincere support and guidance.<br>Thanks to <a href="https://github.com/androguard/androguard">Androguard</a> and <a href="https://github.com/rizinorg/rizin">Rizin</a> for helping me reach my goal.<br>Thanks to <a href="https://www.ttc.org.tw/Eng/">TTC</a> for providing me with a working environment.<br>Thanks to the <a href="https://www.honeynet.org/">Honeynet Project</a> for this great opportunity.<br>Thanks to Google for making all this happen!</p>]]></content>
</entry>
<entry>
<title>GSoC-2021-YuShiangDang</title>
<link href="/2021/08/17/GSoC-2021-YuShiangDang/"/>
<url>/2021/08/17/GSoC-2021-YuShiangDang/</url>
<content type="html"><![CDATA[<h1 id="New-Rule-Generation-Technique-amp-Make-Quark-Everywhere-Among-Security-Open-Source-Projects"><a href="#New-Rule-Generation-Technique-amp-Make-Quark-Everywhere-Among-Security-Open-Source-Projects" class="headerlink" title="New Rule Generation Technique & Make Quark Everywhere Among Security Open Source Projects"></a>New Rule Generation Technique & Make Quark Everywhere Among Security Open Source Projects</h1><h2 id="Summary"><a href="#Summary" class="headerlink" title="Summary"></a>Summary</h2><p>My name is YuShiang Dang. I am a third-year graduate student at <a href="https://eng.nkust.edu.tw/">NKUST</a>, Taiwan. This summer, I participated in a <a href="https://summerofcode.withgoogle.com/">Google Summer of Code</a> project for the <a href="https://www.honeynet.org/">Honeynet Project</a> to contribute to <a href="https://github.com/quark-engine/quark-engine">Quark-Engine</a>. The two main goals of my project are <strong>1. To speed up rule generation for Quark</strong> and <strong>2. To make Quark available across security open source projects.</strong></p><p>For my project, I worked on multiple repositories, including <a href="https://github.com/quark-engine/quark-rule-generate">quark-rule-generate</a>, <a href="https://github.com/skylot/jadx">Jadx</a>, <a href="https://github.com/MobSF/Mobile-Security-Framework-MobSF">MobSF</a> and <a href="https://github.com/APKLab/APKLab">APKLab</a>.</p><p>The following section summarizes <span style="color:#FF3377"><strong>7</strong></span> important pieces of work I’ve done and their impacts:</p><h3 id="1-First-Goal-Implement-a-new-rule-generate-technique-for-quark-rule-generate"><a href="#1-First-Goal-Implement-a-new-rule-generate-technique-for-quark-rule-generate" class="headerlink" title="1. First Goal - Implement a new rule generate technique for quark-rule-generate"></a><strong>1. 
First Goal - Implement a new rule generate technique for quark-rule-generate</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> The new rule generation technique is implemented.<br>– <span style="color:#FF3377"><em>Impact:</em></span> It proved helpful for finding important rules within a relatively short time.<br>– <span style="color:#FF3377"><em>Related PR:</em></span> <strong><a href="https://github.com/quark-engine/quark-rule-generate/pull/2">PR: #2</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#1-Implement-a-new-rule-generate-technique-for-quark-rule-generate">Go to details on this page</a>.</p><h3 id="2-First-Goal-Solve-CPU-idle-problem-for-quark-rule-generate"><a href="#2-First-Goal-Solve-CPU-idle-problem-for-quark-rule-generate" class="headerlink" title="2. First Goal - Solve CPU idle problem for quark-rule-generate"></a><strong>2. First Goal - Solve CPU idle problem for quark-rule-generate</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> A solution to the CPU idle problem was proposed and implemented.<br>– <span style="color:#FF3377"><em>Impact:</em></span> The work significantly improved performance.<br>– <span style="color:#FF3377"><em>Related PR:</em></span> <strong><a href="https://github.com/quark-engine/quark-rule-generate/pull/3">PR: #3</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#2-Solve-CPU-idle-problem-for-quark-rule-generate">Go to details on this page</a>.</p><h3 id="3-Second-Goal-Improve-the-UX-for-using-Quark-in-Jadx"><a href="#3-Second-Goal-Improve-the-UX-for-using-Quark-in-Jadx" class="headerlink" title="3. Second Goal - Improve the UX for using Quark in Jadx"></a><strong>3. 
Second Goal - Improve the UX for using Quark in Jadx</strong></h3><p>– <span style="color:#FF3377"><em>Work #1:</em></span> Implemented Error Dialog.<br>– <span style="color:#FF3377"><em>Work #2:</em></span> Implemented Quark auto installation for Jadx.<br>– <span style="color:#FF3377"><em>Impact:</em></span> Improves the user experience of using Quark in Jadx.<br>– <span style="color:#FF3377"><em>Related PRs:</em></span> <strong><a href="https://github.com/skylot/jadx/pull/1203">PR: #1203</a></strong>, <strong><a href="https://github.com/skylot/jadx/pull/1202">PR: #1202</a></strong>, <strong><a href="https://github.com/skylot/jadx/pull/1199">PR: #1199</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#3-Improve-the-UX-for-using-Quark-in-Jadx">Go to details on this page</a>.</p><h3 id="4-Second-Goal-Provide-more-details-of-Quark’s-summary-report-in-Jadx"><a href="#4-Second-Goal-Provide-more-details-of-Quark’s-summary-report-in-Jadx" class="headerlink" title="4. Second Goal - Provide more details of Quark’s summary report in Jadx"></a><strong>4. Second Goal - Provide more details of Quark’s summary report in Jadx</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> We planned to provide more details to help users quickly locate malicious behaviors in the binary. However, during GSoC, the founder of Jadx, <a href="https://github.com/skylot">Skylot</a>, unexpectedly helped us implement this feature.<br>– <span style="color:#FF3377"><em>Impact:</em></span> With this added information, users can have an overview. 
Then they know where to start and can dive into the source code.<br> – <span style="color:#FF3377"><em>Related Commit:</em></span> <strong><a href="https://github.com/skylot/jadx/commit/b5720bd14e9e9673a60f9aa8e8b1390efef39288">Commit: #b5720b</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#4-Provide-more-details-of-Quark%E2%80%99s-summary-report-in-Jadx">Go to details on this page</a>.</p><h3 id="5-Second-Goal-Integrate-Quark-to-MobSF"><a href="#5-Second-Goal-Integrate-Quark-to-MobSF" class="headerlink" title="5. Second Goal - Integrate Quark to MobSF"></a><strong>5. Second Goal - Integrate Quark to MobSF</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> The Quark integration for MobSF was implemented and merged.<br>– <span style="color:#FF3377"><em>Impact:</em></span> This can definitely help Quark gain many users, since MobSF is a well-known project in mobile security.<br> – <span style="color:#FF3377"><em>Related PR:</em></span> <strong><a href="https://github.com/MobSF/Mobile-Security-Framework-MobSF/pull/1761">PR: #1761</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#5-Integrate-Quark-to-MobSF">Go to details on this page</a>.</p><h3 id="6-Second-Goal-Implement-behavior-map-of-Quark-to-APKLab"><a href="#6-Second-Goal-Implement-behavior-map-of-Quark-to-APKLab" class="headerlink" title="6. Second Goal - Implement behavior map of Quark to APKLab"></a><strong>6. Second Goal - Implement behavior map of Quark to APKLab</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> This feature is implemented and is submitted to APKLab. 
However, this PR is still pending because we encountered some CI issues.<br>– <span style="color:#FF3377"><em>Impact:</em></span> This feature gives APKLab users deeper insight into the function/method calls in the suspicious binary.<br>– <span style="color:#FF3377"><em>Related PR:</em></span> <strong><a href="https://github.com/APKLab/APKLab/pull/135">PR: #135</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#6-Implement-behavior-map-of-Quark-to-APKLab">Go to details on this page</a>.</p><h3 id="7-Second-Goal-Improve-UI-UX-of-Quark-integration-in-APKLab"><a href="#7-Second-Goal-Improve-UI-UX-of-Quark-integration-in-APKLab" class="headerlink" title="7. Second Goal - Improve UI/UX of Quark integration in APKLab"></a><strong>7. Second Goal - Improve UI/UX of Quark integration in APKLab</strong></h3><p>– <span style="color:#FF3377"><em>Work:</em></span> Three issues were opened to improve the Quark UI/UX in APKLab, and discussions are ongoing with the founder of APKLab, <a href="https://github.com/Surendrajat">Surendrajat</a>.<br>– <span style="color:#FF3377"><em>Impact:</em></span> These issues help to improve the readability of Quark reports and the performance of Quark in APKLab.<br>– <span style="color:#FF3377"><em>Related Issues:</em></span> <strong><a href="https://github.com/APKLab/APKLab/issues/142">Issue: #142</a></strong>, <strong><a href="https://github.com/APKLab/APKLab/issues/141">Issue: #141</a></strong>, <strong><a href="https://github.com/APKLab/APKLab/issues/140">Issue: #140</a></strong><br>– <span style="color:#FF3377"><em>Details:</em></span> <a href="#7-Improve-UI-UX-of-Quark-integration-in-APKLab">Go to details on this page</a>.</p><h2 id="Details"><a href="#Details" class="headerlink" title="Details"></a>Details</h2><h3 id="1-Implement-a-new-rule-generate-technique-for-quark-rule-generate"><a href="#1-Implement-a-new-rule-generate-technique-for-quark-rule-generate" class="headerlink" title="1. 
Implement a new rule generate technique for quark-rule-generate"></a>1. Implement a new rule generate technique for quark-rule-generate</h3><p><span style="color:#157040"><em><strong>1.1. Drawbacks of the old technique</strong></em></span></p><p>As described <a href="https://quark-engine.readthedocs.io/en/latest/addRules.html">here</a>, we need two native APIs to construct one detection rule. The old technique simply finds all possible combinations of native APIs in the target APK. For example, if the target APK has N native APIs, the old technique generates N x N rules and verifies all of them. This consumes a lot of time and disk space. Worst of all, we found that most of the rules are useless.</p><p><span style="color:#157040"><em><strong>1.2. The new technique</strong></em></span></p><p>The main goal of the new technique is to find valuable detection rules within a relatively short time. Hence, we first count the number of calls to each API and sort the APIs by that count. Then we define P (primary) as the set of the 20% least used APIs and S (secondary) as the set of the 80% most used APIs. In other words, the total number of APIs (N) is the sum of the sizes of P and S.</p><p>We choose the 20% least used APIs as primary APIs because we found that APIs such as toString() are called everywhere, and toString() is not a helpful API for malware researchers. Why 20%? The answer is simple: we believe in the 20-80 rule and wanted to give it a try.</p><p>So, instead of a single N x N combination, we now have four sets to choose from. See Table 1.1 for our inferences on the rule value and computational cost of each set. Clearly, set (P x P) is our top priority for hunting high-value detection rules.</p><p><img src="https://i.imgur.com/j9CWeIq.png"></p><center>Table 1.1. The comparison of four different sets</center><br><p><span style="color:#157040"><em><strong>1.3. 
Experiment for the new technique:</strong></em></span></p><p>To test our inferences, we chose <a href="https://github.com/AhMyth/AhMyth-Android-RAT">AhMyth RAT</a> as our target APK:</p><p><img src="https://i.imgur.com/CELHY7r.png"></p><center>Figure 1.1. The line chart for the ratio of the number of rules to search times</center><br><p>In the graph, the Y axis represents (the number of 100% rules / the number of API searches). In other words, Y is the average number of 100% rules per API search. The X axis represents the percentage of P.</p><p>The result clearly shows that set PP performs best at every percentage of P. Set SS, not surprisingly, performs worst. Set PS and Set SP perform almost the same. Last but not least, the result supports applying the 20-80 rule to the rule generation technique.</p><p>One drawback of this experiment is that we only used one target APK. Our future goal is to show that our findings also apply to other APKs.</p><p><span style="color:#157040"><em><strong>Related PR:</strong></em></span></p><p>quark-rule-generate <a href="https://github.com/quark-engine/quark-rule-generate/pull/2">PR#2</a>: New rule generate technique</p><p><a href="#1-First-Goal-Implement-a-new-rule-generate-technique-for-quark-rule-generate">Go back to summary</a></p><hr><h3 id="2-Solve-CPU-idle-problem-for-quark-rule-generate"><a href="#2-Solve-CPU-idle-problem-for-quark-rule-generate" class="headerlink" title="2. Solve CPU idle problem for quark-rule-generate"></a>2. Solve CPU idle problem for quark-rule-generate</h3><p>The multiprocessing implementation evenly distributes APIs to each process (one per CPU core) for analysis. However, some processes become idle once they finish analyzing their assigned APIs. This is a waste of CPU resources. 
Therefore, to maximize CPU usage, I implemented a feature that continuously checks the CPU status and, when an idle CPU core is found, reallocates the APIs that are yet to be analyzed across all CPU cores.</p><p><span style="color:#157040"><em><strong>Related PR:</strong></em></span></p><p>quark-rule-generate <a href="https://github.com/quark-engine/quark-rule-generate/pull/3">PR#3</a>: Fix the CPU idle problem</p><p><a href="#2-First-Goal-Solve-CPU-idle-problem-for-quark-rule-generate">Go back to summary</a></p><hr><h3 id="3-Improve-the-UX-for-using-Quark-in-Jadx"><a href="#3-Improve-the-UX-for-using-Quark-in-Jadx" class="headerlink" title="3. Improve the UX for using Quark in Jadx"></a>3. Improve the UX for using Quark in Jadx</h3><p><span style="color:#157040"><em><strong>3.1. Implemented Quark auto installation for Jadx</strong></em></span></p><p>In the previous integration, Jadx users needed to install Quark themselves before using the Quark analysis module. Therefore, for a better UX, I implemented the auto installation of Quark in <a href="https://github.com/skylot/jadx/pull/1199">PR#1199</a>.</p><p><span style="color:#157040"><em><strong>3.2. Implemented Error Dialog</strong></em></span></p><p>In the previous integration of Quark into Jadx, Error/Warning messages were shown only in the logger. In other words, users had no clue that something went wrong until they checked the log messages. Therefore, I implemented a feature that pops up an Error or Warning dialog when Quark is not working properly. See Figure 3.1 and Figure 3.2 for the demo.</p><p><img src="https://i.imgur.com/apIwqVr.png"></p><center>Figure 3.1. Error dialog</center><br><p><img src="https://i.imgur.com/UfibQJ1.png"></p><center>Figure 3.2. Warning dialog</center><br><p><span style="color:#157040"><em><strong>3.3. Implement the progress bar</strong></em></span></p><p>The time consumed by different analyses varies, and we think users have a better UX if they know the progress of the analysis. 
Therefore, I implemented a progress bar on the main window of Jadx to show users the analysis progress. See Figure 3.3 for the demo.</p><p><img src="https://i.imgur.com/9iKPVQ1.gif"></p><center>Figure 3.3. Progress bar for Quark in Jadx</center><br><p><span style="color:#157040"><em><strong>Related PRs</strong></em></span></p><p>Jadx <a href="https://github.com/skylot/jadx/pull/1203">PR#1203</a>: Change Quark task to background task<br>Jadx <a href="https://github.com/skylot/jadx/pull/1202">PR#1202</a>: Add Error/Warning dialogs<br>Jadx <a href="https://github.com/skylot/jadx/pull/1199">PR#1199</a>: Improvements of Quark integration</p><p><a href="#3-Second-Goal-Improve-the-UX-for-using-Quark-in-Jadx">Go back to summary</a></p><hr><h3 id="4-Provide-more-details-of-Quark’s-summary-report-in-Jadx"><a href="#4-Provide-more-details-of-Quark’s-summary-report-in-Jadx" class="headerlink" title="4. Provide more details of Quark’s summary report in Jadx"></a>4. Provide more details of Quark’s summary report in Jadx</h3><p>We planned to provide more details to help users quickly locate malicious behaviors in the binary. With this added information, users can have an overview. Then they know where to start and can dive into the source code.</p><p>However, during GSoC, the founder of Jadx, Skylot, unexpectedly helped us implement this feature. We appreciate his kind and unexpected help. :D</p><p>See Figure 4.1 and Figure 4.2 for the demo.</p><p><img src="https://i.imgur.com/t9wcIrd.png"></p><center>Figure 4.1. The binary overview</center><br><p><img src="https://i.imgur.com/p9qp2SC.gif"></p><center>Figure 4.2. 
Jump to the source code </center><br><p><span style="color:#157040"><em><strong>Related Commits</strong></em></span></p><p>Jadx <a href="https://github.com/skylot/jadx/commit/b5720bd14e9e9673a60f9aa8e8b1390efef39288">commit b5720b:</a> fix(gui): improve Quark tasks scheduling and report viewer</p><p><a href="#4-Second-Goal-Provide-more-details-of-Quark%E2%80%99s-summary-report-in-Jadx">Go back to summary</a></p><hr><h3 id="5-Integrate-Quark-to-MobSF"><a href="#5-Integrate-Quark-to-MobSF" class="headerlink" title="5. Integrate Quark to MobSF"></a>5. Integrate Quark to MobSF</h3><p><span style="color:#157040"><em><strong>5.1. Add Quark Analysis Report</strong></em></span></p><p>The Quark integration for MobSF has been implemented and merged. See Figure 5.1 for the demo.</p><p><img src="https://i.imgur.com/icBM873.png"></p><center>Figure 5.1. Quark analysis report in MobSF</center><br><p><span style="color:#157040"><em><strong>5.2. Dive into the Source Code</strong></em></span></p><p>We also implemented a killer feature in MobSF. By clicking on an activity shown in Figure 5.1, users can jump to where the suspicious behavior happens. See Figure 5.2 for the demo.</p><p><img src="https://i.imgur.com/L83yrws.gif"></p><center>Figure 5.2. The demo for source code overview</center><br><p><span style="color:#157040"><em><strong>Related PR:</strong></em></span></p><p>MobSF <a href="https://github.com/MobSF/Mobile-Security-Framework-MobSF/pull/1761">PR#1761</a>: Add Quark Engine as one of the static analyzers</p><p><a href="#5-Second-Goal-Integrate-Quark-to-MobSF">Go back to summary</a></p><hr><h3 id="6-Implement-behavior-map-of-Quark-to-APKLab"><a href="#6-Implement-behavior-map-of-Quark-to-APKLab" class="headerlink" title="6. Implement behavior map of Quark to APKLab"></a>6. Implement behavior map of Quark to APKLab</h3><p><span style="color:#157040"><em><strong>6.1. 
Add behavior map to APKLab</strong></em></span></p><p>Before we implemented this feature, we fixed a permission issue for APKLab. See <a href="https://github.com/APKLab/APKLab/pull/135">PR#135</a> for more information. The behavior map is implemented. See Figure 6.1 and Figure 6.2. With the behavior map, users can quickly understand the relationships between the suspicious behaviors Quark detected. However, this PR is still pending because we encountered some CI issues.</p><p><img src="https://i.imgur.com/d5w5Z9I.gif"></p><center>Figure 6.1. Behavior map in APKLab</center><br><p><img src="https://i.imgur.com/r9EwLun.png"></p><center>Figure 6.2. Behavior map</center><br><p><span style="color:#157040"><em><strong>Related PR:</strong></em></span></p><p>APKLab <a href="https://github.com/APKLab/APKLab/pull/135">PR#135</a>: Quark integration improvement</p><p><a href="#6-Second-Goal-Implement-behavior-map-of-Quark-to-APKLab">Go back to summary</a></p><hr><h3 id="7-Improve-UI-UX-of-Quark-integration-in-APKLab"><a href="#7-Improve-UI-UX-of-Quark-integration-in-APKLab" class="headerlink" title="7. Improve UI/UX of Quark integration in APKLab"></a>7. Improve UI/UX of Quark integration in APKLab</h3><p><span style="color:#157040"><em><strong>7.1. [UX] Time-consuming analysis of large APKs</strong></em></span></p><p>In the previous integration of Quark into APKLab, we found that analyzing large APKs may take a long time. We see two possible causes. The problem could be the slow performance of Quark itself. Or the problem could be the slow performance of executing the tools that work with Quark at the same time.</p><p>As for the performance of Quark, our team conducted a performance assessment and proposed improvements in another GSoC project. Therefore, in this project, we chose to execute the tools that work with Quark asynchronously. This way, users can use other features first and no time is wasted. 
See <a href="https://github.com/APKLab/APKLab/issues/140">issue#140</a> for my discussion with APKLab.</p><p><span style="color:#157040"><em><strong>7.2. [UX] More options for the suspicious behavior traversal</strong></em></span></p><p>The Quark report only shows activities with 100% confidence, but we found that most activities with 80% confidence are also valuable.</p><p>Therefore, we add a checkbox to filter by confidence percentage and provide a better UX for the suspicious behavior traversal. See <a href="https://github.com/APKLab/APKLab/issues/141">issue#141</a> for my discussion with APKLab.</p><p><span style="color:#157040"><em><strong>7.3. [UI] Indentation of the sub-item in Quark report</strong></em></span></p><p>The lack of indentation for sub-items in the Quark report may confuse users. Therefore, we opened an issue to discuss it with APKLab. See <a href="https://github.com/APKLab/APKLab/issues/142">issue#142</a> for more information.</p><p><span style="color:#157040"><em><strong>Related Issues:</strong></em></span></p><p>APKLab <a href="https://github.com/APKLab/APKLab/issues/140">Issue#140</a> Quark analysis may take a long time<br>APKLab <a href="https://github.com/APKLab/APKLab/issues/141">Issue#141</a> Quark report only shows 100% confidence activities<br>APKLab <a href="https://github.com/APKLab/APKLab/issues/142">Issue#142</a> There are no indent spaces at the sub-item in the Quark report</p><p><a href="#7-Second-Goal-Improve-UI-UX-of-Quark-integration-in-APKLab">Go back to summary</a></p><h2 id="Acknowledgments"><a href="#Acknowledgments" class="headerlink" title="Acknowledgments"></a>Acknowledgments</h2><p>Thanks to Google for providing such a great project.<br>Thanks to the <a href="https://www.honeynet.org/">Honeynet Project</a> for giving me the opportunity.<br>Thanks to my mentor, <a href="https://github.com/18z">KunYu Chen</a>, for all his sincere support and guidance.<br>Thanks to <a href="https://github.com/skylot/jadx">Jadx</a>, 
<a href="https://github.com/MobSF/Mobile-Security-Framework-MobSF">MobSF</a>, and <a href="https://github.com/APKLab/APKLab">APKLab</a> for helping me reach my goal.<br>Thanks to <a href="https://www.ttc.org.tw/eng/">TTC</a> for providing me with a working environment.</p>]]></content>
</entry>
</search>