BAR2019 Paper #14 Reviews and Comments
===========================================================================
Paper #14 ACIDroid: A Practical App Cache Integrity Protection System on Android Runtime

Review #14A
===========================================================================

Overall merit
-------------
1. Reject (off-topic or having fundamental issues)

Reviewer expertise
------------------
3. Knowledgeable

Paper summary
-------------
This paper presents a technique to defend against a specific attack on Android called the "app cache tampering attack." The authors test the effectiveness of their approach on 14 apps.

Comments for author
-------------------
I think this paper is off-topic for this venue. Although papers presenting bytecode analysis techniques are in scope, this paper does not show any novel analysis in this area. Specifically, the analysis performed is mostly just a comparison of class/method names and, in some cases, bytecode instructions.

I am also not convinced by the threat model assumed in this paper. The paper assumes that an attacker is able to break the sandboxing mechanisms of Android in order to modify cache files. Typically, attackers reach this scenario by achieving root privileges. However, once root privileges are reached, an attacker can also circumvent ACIDroid's checks. This is even stated by the authors: "The security of ACIDroid is somewhat limited when ACIDroid runs on the app itself. Without a platform-level secure execution environment, all self-integrity check mechanisms can be practically bypassed by reverse engineering and app repackaging [22] because the codes for the integrity check in apps can be modified by attackers."

The authors claim that many users own rooted phones (beginning of Section III), but this does not necessarily mean that an attacker can easily get root privileges on those devices. In particular, on devices "rooted" using apps like SuperSU, the user can select which apps to give root privileges to.

As a short paper, it should be at most 6 pages long, but it is currently 7 pages long.

To improve this paper, the authors should discuss in detail, and quantify, the benefits that the "app cache tampering attack" gives to an attacker who has already bypassed the device's sandboxing mechanisms. One possibility is that the "app cache tampering attack" can simplify (e.g., automate) attacks that would otherwise require a lot of reverse engineering to bypass apps' self-defense mechanisms. The authors could also explore the possibility of moving their app-integrity-verification code to a Trusted Execution Environment (e.g., TrustZone), as they briefly mention.

The authors should also mention SafetyNet, a Google-provided mechanism to verify the integrity of apps (e.g., that the code has not been changed) and of devices (that the device firmware has not been modified). Can "app cache tampering attacks" be used to automatically bypass SafetyNet?

Review #14B
===========================================================================

Overall merit
-------------
2. Weak reject (major changes are necessary to publish the paper)

Reviewer expertise
------------------
3. Knowledgeable

Paper summary
-------------
Privileged attackers may find it easier to tamper with apps by editing ART cache files instead of the APKs. Leveraging knowledge of all possible Android optimizations, the authors improve storage and performance overhead compared to other cache-defense systems.

Comments for author
-------------------
I think the paper would benefit from making the threat model explicit. It is interesting that some apps perform "APK self-defense", but what is the main benefit of a cache-based attack compared with changing other internals (such as the real cache or APK location)? How much would cache-defense hamper such a privileged attacker?
It would especially help to compare with other systems that claim transparency (Magisk, in particular), with common app repackaging, and with the app self-defense itself (a prerequisite for the attack is that APK-verification systems work: could they protect cache files too?).

The evaluation needs more detail. Were apps just started and stopped? What are the "common features" in all recorded logs that are detected to stop the timer? All run-times are on the order of milliseconds, which seems particularly low for such complex apps, especially since they self-verify (possibly even using a remote server). The table should also separate AOSP-emulator and real-device results, since the underlying performance is so different.

Will the evaluation logs be released? Will ACIDroid be open-sourced?

Partial inspection sounds interesting, but is left largely unspecified. Which app portions can be omitted without compromising security? In particular, it should be clarified how it was used in the evaluation (end of V.B). Was it enough to check the first DEX file only? Should the evaluation be repeated in "full inspection" mode? Was the system of Wan et al. [2] also modified to check only the first DEX file?

Other:
- How "fixed" is the optimization system? Is it the same on every Android device? Every major version? Could this defense lose its performance benefit due to the exploding number of possible variations?
- It would be nice to provide some examples of the self-defenses used in the 14 evaluated apps.
- Some sections need rewording (e.g., end of II.A onward).

Review #14C
===========================================================================

Overall merit
-------------
2. Weak reject (major changes are necessary to publish the paper)

Reviewer expertise
------------------
3. Knowledgeable

Paper summary
-------------
The introduction of ahead-of-time compilation through ART in Android opens the door for user-mode rootkits that hide their presence by modifying the resulting native (cached) code (OAT file) that is generated on the device from the original APK file. To deal with this problem, the paper presents ACIDroid, an integrity verification mechanism that checks whether the cached code contains unexpected modifications. The idea is basically to re-generate parts of the optimized code starting from the original APK, and compare them with the cached ones. Any observed differences will be the result of tampering.

Comments for author
-------------------
I appreciate the effort of the authors to come up with a solution that will strengthen the integrity of ART apps. However, I feel that the work presented i) is based on wrong assumptions regarding the threat model and the capabilities of the attacker, and ii) offers a solution that is rather overcomplicated.

First of all, the severity of the particular threat tackled by this paper is unclear. Tampering with the cached code requires root access. Consequently, there is nothing that prevents the attacker from tampering with ACIDroid itself. In that sense, the second "self-protection" deployment shown in Figure 6 can be easily bypassed.

A possible counter-argument (not made in the paper, but I mention it to bring the discussion to the second point above) would be that attackers may want to avoid tampering with system code (e.g., due to verified boot). Their next option then is to tamper with user-mode apps. If we assume that the attacker cannot tamper with ACIDroid's verification code (and system code in general, which is the scenario depicted in Figure 5), then why is such a complex technique needed in the first place?
Wouldn't it be sufficient to just compute a cryptographic hash of the cached code upon installation (the paper assumes that the original APK is trusted), and just check the hash before running the app? This would detect even the slightest tampering, and avoid all the corner cases and weaknesses of the current scheme, which essentially performs only partial verification by re-optimizing the original DEX code and comparing it with the cached code.

From a persistence perspective, the attack is very fragile because every time an app self-updates, the optimized code will be re-generated, and thus the attacker's modifications will be lost. The attacker essentially has to re-exploit the device and tamper with the app from scratch.

I recommend clearly discussing the assumed threat model, providing a detailed description of what the attacker can and cannot do in each case and of what the resulting attacks can achieve in each case, and THEN discussing the design of ACIDroid and the requirements for offering meaningful protection in each case.

Review #14D
===========================================================================

Overall merit
-------------
2. Weak reject (major changes are necessary to publish the paper)

Reviewer expertise
------------------
2. Some familiarity

Paper summary
-------------
This paper introduces a new methodology for preventing app cache tampering attacks on the Android Runtime. App cache tampering attacks rely on the fact that, starting with Android 4.4, Google introduced an install-time optimization whose optimized code is stored in the app cache in order to speed up the execution of the application at runtime. This means that, even if some sensitive applications already provide a mechanism to check the signature of the installed APK against the expected one, the integrity of the optimized code that resides in the application cache is not thoroughly verified.
In particular, the authors improve an existing technique for preventing this kind of attack, developed by Wan et al. [1], which precomputes hash values over DEX code. They reduce the time and storage needed to precompute these hashes by performing the operation on the optimized version of the DEX code, which is smaller in size. The authors then explain the technique and evaluate its time and storage overhead against the base case (no integrity mechanism) and against the integrity mechanism previously introduced by Wan et al. [1], by implementing the technique and applying it to 14 popular Android apps.

[1]: J. Wan, M. Zulkernine, P. Eisen, and C. Liem, “Defending Application Cache Integrity of Android Runtime,” in Proceedings of the 13th International Conference on Information Security Practice and Experience. Springer, 2017.

Comments for author
-------------------
The threat model under which this work is developed is that a potential attacker has root access on the target device, in order to be able to write to the cache of another application. Although some data is presented to support the claim that almost 7% of Android devices are currently rooted, not all of them have been rooted maliciously; where they have not, the application would need an explicit grant of root privileges (think SuperSU), which reduces the available attack surface. Moreover, once an application is granted root privileges, would the security of the other running applications not already be compromised by other tampering techniques, such as silently installing a malicious APK?

The evaluation section is quite complete and covers many different scenarios, since different Android versions, optimization levels, and hardware architectures have been analyzed. However, the evaluation procedure needs a better explanation (the use of emulators, and which version of ACIDroid was used: full inspection or partial inspection).

Another critical point is that it is not completely clear at which level ACIDroid is implemented. It seems that the verification is done by Android: section VI, paragraph a, mentions that ACIDroid can be incorporated in the integrity verification procedure, which is done by Android. At the same time, paragraph b of the same section says that the security of ACIDroid is limited when it runs on the app itself.

The following list highlights other specific problems found in the paper:
- In section I, the summary of the main contributions suggests that the 18 rules for converting to optimized DEX code are a contribution of this paper; only in section IV, paragraph B, does it become clear that they are already part of ART.
- Although this is a short paper, it consists of 7 pages, since part of section VI and section VII are on the seventh page.
- Section III mentions smali code; a brief explanation would help.
- Section III explains that the threat model encompasses a rooted device. This assumption is quite strong, and having root privileges on the device would easily enable other types of attacks, such as tampering with the requests to the external server to obtain the secret key needed for running the application. It might have been more interesting to investigate scenarios where the app cache is tampered with by interacting directly with the storage device through physical access.
- There is no analysis of situations in which ACIDroid could report a false positive. Even if such a situation cannot arise, please explain why.
- In section V.a, it is said that the experiments were conducted on 3 different Android versions, while later 4 versions are listed.
- In section V.a, there are repeated mentions of AOSP images, without saying how these images were actually run (probably using an Android emulator?).
- In section V.b, the method used to acquire the ready time is not well explained and documented.
- In section V.b, the discussion of the storage overhead of the method proposed by Wan et al. [1] supposes that a given device needs to store the signatures for many Android versions, architectures, and so on. Why not store only the signatures for the architecture and Android version actually in use? The Play Store supports deploying device-specific APKs, and it would even be possible to keep only the relevant hashes at installation time, since we assume that ACIDroid runs at the Android level (an assumption explained above). At the very least, a per-architecture APK could be produced containing only the necessary signatures.
- In section IV.a, it is said that there are two possible verification methods, one that provides full inspection and one that provides partial inspection. Later, in section V, paragraph b, the discussion of the experimental results says that the only inspected file was the first DEX file. How are these two things related? Did you use partial inspection in the evaluation? The criterion used seems different (partial inspection is said to check only a part of the DEX files). How much overhead is caused by full inspection, especially compared to the preexisting method?
- In section VI, the mechanism that uses an external server for verification implies that an application cannot be used without an internet connection.
  This is a severe limitation for limited benefit, given that on a rooted device the input of an application cannot really be trusted.
- In section VI, towards the end, the discussion of the verification method without an external server repeats the explanation of how ACIDroid works.
- There are quite a few grammatical and syntactical errors, and sentences that are poorly constructed and need multiple reads to be understood. These make the paper difficult to read in some passages.

Due to the major concerns, in particular concerning the threat model, we do not deem the paper ready for publication.
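
Illustrative sketch of the simpler alternative raised in Review #14C (compute a cryptographic hash of the cached code upon installation, then check the hash before running the app). This is only a minimal sketch under stated assumptions, not ACIDroid's or Android's implementation: the function names, the file name `base.odex`, and the choice of SHA-256 are hypothetical, and the sketch assumes the reference digest is kept where the attacker cannot modify it, in line with Review #14C's premise that the verification code itself cannot be tampered with.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def digest_file(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def record_at_install(cache_file: Path) -> str:
    """Hypothetical install-time step: compute the reference digest.

    Assumption: the caller stores the result in tamper-proof storage.
    """
    return digest_file(cache_file)


def verify_before_launch(cache_file: Path, reference: str) -> bool:
    """Hypothetical launch-time step: re-hash the file and compare."""
    return hmac.compare_digest(digest_file(cache_file), reference)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        cached = Path(d) / "base.odex"  # stand-in for a cached code file
        cached.write_bytes(b"optimized code")
        reference = record_at_install(cached)
        print(verify_before_launch(cached, reference))  # True

        cached.write_bytes(b"optimized c0de")  # simulated tampering
        print(verify_before_launch(cached, reference))  # False
```

Any change to the cached file changes its digest, so the second check fails. The sketch does not cover app updates, after which the optimized code is re-generated and the reference digest would have to be recomputed.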