ProGuard Source Code Analysis, Part 1: Argument Parsing

Author: (unavailable) | Published: 2022-01-04 16:10

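    A while ago, a project required me to customize ProGuard, so I cloned its source code, studied it, and modified it. Starting with this post, I will write a series of articles that work through how ProGuard operates, starting from the source. The ProGuard source version examined in this post is 5.3.4.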

    ProGuard's overall execution flow can be roughly divided into the following stages:


    • Parsing arguments
      ProGuard's entry point is in ProGuard.java. The main method first creates a ConfigurationParser to parse the input args, and the parsed values are stored in an object of type Configuration. The code is as follows:
    /**
     * The main method for ProGuard.
     */
    public static void main(String[] args)
    {
        // Some code omitted here...
        // Create the default options.
        Configuration configuration = new Configuration();
        try
        {
            // Parse the options specified in the command line arguments.
            ConfigurationParser parser = new ConfigurationParser(args,
                                                                    System.getProperties());
            try
            {
                parser.parse(configuration);
            }
            finally
            {
                parser.close();
            }
            // Execute ProGuard with these options.
            new ProGuard(configuration).execute();
        }
        // Some code omitted here...
        System.exit(0);
    }
    

    Internally, ConfigurationParser in turn creates an ArgumentWordReader, which is responsible for reading the incoming arguments:

    /**
     * Creates a new ConfigurationParser for the given String arguments,
     * with the given base directory and the given Properties.
     */
    public ConfigurationParser(String[]   args,
                                File       baseDir,
                                Properties properties) throws IOException
    {
        this(new ArgumentWordReader(args, baseDir), properties);
    }
    
    /**
     * Creates a new ConfigurationParser for the given word reader and the
     * given Properties.
     */
    public ConfigurationParser(WordReader reader,
                                Properties properties) throws IOException
    {
        this.reader     = reader;
        this.properties = properties;
        readNextWord();
    }
    

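    readNextWord essentially calls the nextWord method of ArgumentWordReader to start extracting option names. The implementation of nextWord is fairly simple, mostly string checks and slicing. An excerpt of the logic is shown below: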

    /**
     * Reads a word from this WordReader, or from one of its active included
     * WordReader objects.
     *
     * @param isFileName         return a complete line (or argument), if the word
     *                           isn't an option (it doesn't start with '-').
     * @param expectSingleFile   if true, the remaining line is expected to be a
     *                           single file name (excluding path separator),
     *                           otherwise multiple files might be specified
     *                           using the path separator.
     * @return the read word.
     */
    public String nextWord(boolean isFileName,
                            boolean expectSingleFile) throws IOException
    {
        // Some code omitted here...
        currentWord = null;
        // Make sure we have a non-blank line.
        while (currentLine == null || currentIndex == currentLineLength)
        {
            // Read the next line of input arguments...
            currentLine = nextLine();
            if (currentLine == null)
            {
                return null;
            }
    
            currentLineLength = currentLine.length();
    
            // Skip any leading whitespace.
            currentIndex = 0;
            while (currentIndex < currentLineLength &&
                    Character.isWhitespace(currentLine.charAt(currentIndex)))
            {
                currentIndex++;
            }
    
            // Remember any leading comments.
            if (currentIndex < currentLineLength &&
                isComment(currentLine.charAt(currentIndex)))
            {
                // Remember the comments.
                String comment = currentLine.substring(currentIndex + 1);
                currentComments = currentComments == null ?
                    comment :
                    currentComments + '\n' + comment;
    
                // Skip the comments.
                currentIndex = currentLineLength;
            }
        }
    
        // currentIndex now points at the start of the next word (startIndex).
        // Find the word starting at the current index.
        int startIndex = currentIndex;
        int endIndex;
    
        char startChar = currentLine.charAt(startIndex);
        // Some code omitted here...
        else
        {
            // The next word is a simple character string.
            // Find the end of the line, the first delimiter, or the first
            // white space.
            while (currentIndex < currentLineLength)
            {
                char currentCharacter = currentLine.charAt(currentIndex);
                if (isNonStartDelimiter(currentCharacter)    ||
                    Character.isWhitespace(currentCharacter) ||
                    isComment(currentCharacter)) {
                    break;
                }
    
                currentIndex++;
            }
    
            endIndex = currentIndex;
        }
    
        // Remember and return the parsed word.
        currentWord = currentLine.substring(startIndex, endIndex);
        return currentWord;
    }
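
The scanning logic above can be sketched as a small, self-contained word reader. This is a simplified illustration, not ProGuard's actual WordReader: the class name SimpleWordReader, the single-line input, and the delimiter and comment characters are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified word reader; not ProGuard's actual WordReader. */
class SimpleWordReader {
    private final String line;
    private int index = 0;

    SimpleWordReader(String line) {
        this.line = line;
    }

    // Delimiter set assumed for this sketch only.
    private static boolean isDelimiter(char c) {
        return c == ',' || c == '{' || c == '}' || c == '(' || c == ')' || c == ';';
    }

    // Comment marker assumed for this sketch only.
    private static boolean isComment(char c) {
        return c == '#';
    }

    /** Returns the next word, or null when the line is exhausted. */
    String nextWord() {
        // Skip any leading whitespace.
        while (index < line.length() && Character.isWhitespace(line.charAt(index))) {
            index++;
        }
        // Nothing left, or the rest of the line is a comment.
        if (index == line.length() || isComment(line.charAt(index))) {
            index = line.length();
            return null;
        }
        int start = index;
        if (isDelimiter(line.charAt(index))) {
            // A delimiter is returned as a word of its own.
            index++;
        } else {
            // Advance to the next delimiter, whitespace, or comment.
            while (index < line.length()) {
                char c = line.charAt(index);
                if (isDelimiter(c) || Character.isWhitespace(c) || isComment(c)) {
                    break;
                }
                index++;
            }
        }
        return line.substring(start, index);
    }

    /** Collects all words of a single line. */
    static List<String> tokenize(String line) {
        SimpleWordReader reader = new SimpleWordReader(line);
        List<String> words = new ArrayList<>();
        for (String word = reader.nextWord(); word != null; word = reader.nextWord()) {
            words.add(word);
        }
        return words;
    }
}
```

Under these assumptions, SimpleWordReader.tokenize("-injars test.jar") yields the two words -injars and test.jar, and a comma in the input comes back as a separate word.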
    

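    Here is a simple example. When running java -jar proguard.jar -injars test.jar, nextWord extracts the option keyword -injars. Once the name is known, its argument has to be parsed next. Back in the parse method of ConfigurationParser, each keyword has its own parsing code, and a while loop keeps going until every input argument has been parsed. The code is as follows: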

    /**
     * Parses and returns the configuration.
     * @param configuration the configuration that is updated as a side-effect.
     * @throws ParseException if the any of the configuration settings contains
     *                        a syntax error.
     * @throws IOException if an IO error occurs while reading a configuration.
     */
    public void parse(Configuration configuration)
    throws ParseException, IOException
    {
        while (nextWord != null)
        {
            lastComments = reader.lastComments();
    
            // First include directives.
            if      (ConfigurationConstants.AT_DIRECTIVE                                     .startsWith(nextWord) ||
                        ConfigurationConstants.INCLUDE_DIRECTIVE                                .startsWith(nextWord)) configuration.lastModified                          = parseIncludeArgument(configuration.lastModified);
            else if (ConfigurationConstants.BASE_DIRECTORY_DIRECTIVE                         .startsWith(nextWord)) parseBaseDirectoryArgument();
    
            // Then configuration options with or without arguments.
            else if (ConfigurationConstants.INJARS_OPTION                                    .startsWith(nextWord)) configuration.programJars                           = parseClassPathArgument(configuration.programJars, false);
            else if (ConfigurationConstants.OUTJARS_OPTION                                   .startsWith(nextWord)) configuration.programJars                           = parseClassPathArgument(configuration.programJars, true);
            // For brevity, a batch of similar branches is omitted here...
            else
            {
                throw new ParseException("Unknown option " + reader.locationDescription());
            }
        }
    }
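
The loop above combines two ideas: a look-ahead word that is primed in the constructor, and a dispatch written as CONSTANT.startsWith(nextWord), which as written also accepts a word that is only a prefix of the option name. The following is a minimal sketch of that pattern. The class MiniOptionParser, its fields, and its error handling are hypothetical and are not ProGuard's real parser.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical miniature of the look-ahead loop; not ProGuard's real parser. */
class MiniOptionParser {
    // Option names used by this sketch.
    static final String INJARS_OPTION  = "-injars";
    static final String OUTJARS_OPTION = "-outjars";

    final List<String> inJars  = new ArrayList<>();
    final List<String> outJars = new ArrayList<>();

    private final Iterator<String> words;
    private String nextWord;

    MiniOptionParser(List<String> args) {
        this.words = args.iterator();
        readNextWord(); // Prime the look-ahead word before parsing starts.
    }

    private void readNextWord() {
        nextWord = words.hasNext() ? words.next() : null;
    }

    void parse() {
        while (nextWord != null) {
            // constant.startsWith(word): the word may be a prefix of the option name.
            if      (INJARS_OPTION .startsWith(nextWord)) inJars .add(parseArgument());
            else if (OUTJARS_OPTION.startsWith(nextWord)) outJars.add(parseArgument());
            else throw new IllegalArgumentException("Unknown option " + nextWord);
        }
    }

    // Consumes the option's single argument and advances the look-ahead word.
    private String parseArgument() {
        String option = nextWord;
        readNextWord();
        if (nextWord == null) {
            throw new IllegalArgumentException("Missing argument for " + option);
        }
        String argument = nextWord;
        readNextWord();
        return argument;
    }
}
```

Each branch is responsible for consuming its own arguments and leaving nextWord on the following option, which is why the outer loop only needs to test nextWord against null.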
    
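    • Storing the parsed arguments
      As mentioned earlier, all input arguments that ProGuard parses are stored in an object of type Configuration. This object is used throughout the whole ProGuard run: instantiating the ClassPool, reading ProgramClass and LibraryClass instances, deciding which classes and methods to keep during shrinking, reading the mapping file during obfuscation, and so on all start by getting their parameters from the Configuration object.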
    /*
     * ProGuard -- shrinking, optimization, obfuscation, and preverification
     *             of Java bytecode.
     *
     * Copyright (c) 2002-2016 Eric Lafortune @ GuardSquare
     *
     * This program is free software; you can redistribute it and/or modify it
     * under the terms of the GNU General Public License as published by the Free
     * Software Foundation; either version 2 of the License, or (at your option)
     * any later version.
     *
     * This program is distributed in the hope that it will be useful, but WITHOUT
     * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
     * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
     * more details.
     *
     * You should have received a copy of the GNU General Public License along
     * with this program; if not, write to the Free Software Foundation, Inc.,
     * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
     */
    package proguard;
    
    import java.io.File;
    import java.util.List;
    
    /**
     * The ProGuard configuration.
     *
     * @see ProGuard
     *
     * @author Eric Lafortune
     */
    public class Configuration
    {
        public static final File STD_OUT = new File("");
    
        ///////////////////////////////////////////////////////////////////////////
        // Keep options.
        ///////////////////////////////////////////////////////////////////////////
    
        /**
         * A list of {@link KeepClassSpecification} instances, whose class names and
         * class member names are to be kept from shrinking, optimization, and/or
         * obfuscation.
         */
        public List      keep;
    
    
        ///////////////////////////////////////////////////////////////////////////
        // Shrinking options.
        ///////////////////////////////////////////////////////////////////////////
    
        /**
         * Specifies whether the code should be shrunk.
         */
        public boolean   shrink                           = true;
    
        /**
         * Specifies whether the code should be optimized.
         */
        public boolean   optimize                         = true;
    
        public boolean   optimizeNoSideEffects           = false;
    
        /**
         * A list of <code>String</code>s specifying the optimizations to be
         * performed. A <code>null</code> list means all optimizations. The
         * optimization names may contain "*" or "?" wildcards, and they may
         * be preceded by the "!" negator.
         */
        public List      optimizations;
    
        /**
         * A list of {@link ClassSpecification} instances, whose methods are
         * assumed to have no side effects.
         */
        public List      assumeNoSideEffects;
    
        /**
         * Specifies whether the access of class members can be modified.
         */
        public boolean   allowAccessModification          = false;
    
        ///////////////////////////////////////////////////////////////////////////
        // Obfuscation options.
        ///////////////////////////////////////////////////////////////////////////
    
        /**
         * Specifies whether the code should be obfuscated.
         */
        public boolean   obfuscate                        = true;
    
        /**
         * An optional output file for listing the obfuscation mapping.
         * An empty file name means the standard output.
         */
        public File      printMapping;
    
        /**
         * An optional input file for reading an obfuscation mapping.
         */
        public File      applyMapping;
    
        /**
         * An optional name of a file containing obfuscated class member names.
         */
        public File      obfuscationDictionary;
    
        /**
         * A list of <code>String</code>s specifying package names to be kept.
         * A <code>null</code> list means no names. An empty list means all
         * names. The package names may contain "**", "*", or "?" wildcards, and
         * they may be preceded by the "!" negator.
         */
        public List      keepPackageNames;
    
    
        /**
         * Specifies whether to print verbose messages.
         */
        public boolean   verbose                          = false;
    
        /**
         * A list of <code>String</code>s specifying a filter for the classes for
         * which not to print notes, if there are noteworthy potential problems.
         * A <code>null</code> list means all classes. The class names may contain
         * "**", "*", or "?" wildcards, and they may be preceded by the "!" negator.
         */
        public List      note                             = null;
    
        /**
         * A list of <code>String</code>s specifying a filter for the classes for
         * which not to print warnings, if there are any problems.
         * A <code>null</code> list means all classes. The class names may contain
         * "**", "*", or "?" wildcards, and they may be preceded by the "!" negator.
         */
        public List      warn                             = null;
    
        /**
         * Specifies whether to ignore any warnings.
         */
        public boolean   ignoreWarnings                   = false;
    }
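
To illustrate how a plain holder object like this can drive later stages, here is a minimal sketch. MiniConfiguration and MiniPipeline are hypothetical names invented for the example. Only the three boolean flags and their default values are taken from the listing above, and the stage selection is an assumption about how such flags might be consulted, not ProGuard's actual execute logic.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in holding a few of the flags shown above. */
class MiniConfiguration {
    boolean shrink    = true;
    boolean optimize  = true;
    boolean obfuscate = true;
}

/** Hypothetical pipeline that reads its settings from the configuration. */
class MiniPipeline {
    /** Returns the names of the stages that would run for this configuration. */
    static List<String> plannedStages(MiniConfiguration configuration) {
        List<String> stages = new ArrayList<>();
        if (configuration.shrink)    stages.add("shrink");
        if (configuration.optimize)  stages.add("optimize");
        if (configuration.obfuscate) stages.add("obfuscate");
        return stages;
    }
}
```

The parser only writes fields and the later stages only read them, so the configuration object is the single hand-off point between parsing and processing.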
    

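    Configuration has many fields. I have kept only some of the more common ones here, which are mostly the ones we normally set in a configuration file. We will only look at the important keep field: the keep rules we write in the configuration file all end up stored in this field.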

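    Back in the parse method of ConfigurationParser: when the keyword read by ArgumentWordReader is -keep, -keepclassmembers, -keepclasseswithmembers, -keepnames, -keepclassmembernames, -keepclasseswithmembernames, or similar, ProGuard parses the keep arguments that follow and reads out the rules for the classes we want to keep. (Tip: to find out which other options ProGuard supports, just look through the keywords in the parse method.)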

    public void parse(Configuration configuration)
    throws ParseException, IOException
    {
        while (nextWord != null)
        {
            lastComments = reader.lastComments();
            // Earlier branches omitted...
            else if (ConfigurationConstants.IF_OPTION                                        .startsWith(nextWord)) configuration.keep                                  = parseIfCondition(configuration.keep);
            else if (ConfigurationConstants.KEEP_OPTION                                      .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, true,  false, false, null);
            else if (ConfigurationConstants.KEEP_CLASS_MEMBERS_OPTION                        .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, false, false, null);
            else if (ConfigurationConstants.KEEP_CLASSES_WITH_MEMBERS_OPTION                 .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, true,  false, null);
            else if (ConfigurationConstants.KEEP_NAMES_OPTION                                .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, true,  false, true,  null);
            else if (ConfigurationConstants.KEEP_CLASS_MEMBER_NAMES_OPTION                   .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, false, true,  null);
            else if (ConfigurationConstants.KEEP_CLASSES_WITH_MEMBER_NAMES_OPTION            .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, true,  true,  null);
            else if (ConfigurationConstants.PRINT_SEEDS_OPTION                               .startsWith(nextWord)) configuration.printSeeds                            = parseOptionalFile();
        }
    }
    

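    As you can see, however a keep rule is written, it is ultimately read through parseKeepClassSpecificationArguments. The variant called from parse is simple: internally it only creates an ArrayList and leaves the actual parsing to the overloaded method below: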

    /**
     * Parses and returns a class specification to keep classes and class
     * members.
     * @throws ParseException if the class specification contains a syntax error.
     * @throws IOException    if an IO error occurs while reading the class
     *                        specification.
     */
    private KeepClassSpecification parseKeepClassSpecificationArguments(boolean            markClasses,
                                                                        boolean            markConditionally,
                                                                        boolean            allowShrinking,
                                                                        ClassSpecification condition)
    throws ParseException, IOException
    {
        boolean markDescriptorClasses = false;
        boolean markCodeAttributes    = false;
        //boolean allowShrinking        = false;
        boolean allowOptimization     = false;
        boolean allowObfuscation      = false;
    
        // Read the keep modifiers.
        while (true)
        {
            readNextWord("keyword '" + ConfigurationConstants.CLASS_KEYWORD +
                            "', '"      + JavaConstants.ACC_INTERFACE +
                            "', or '"   + JavaConstants.ACC_ENUM + "'",
                            false, false, true);
    
            if (!ConfigurationConstants.ARGUMENT_SEPARATOR_KEYWORD.equals(nextWord))
            {
                // Not a comma. Stop parsing the keep modifiers.
                break;
            }
    
            readNextWord("keyword '" + ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION +
                            "', '"      + ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION +
                            "', or '"   + ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION + "'");
    
            if      (ConfigurationConstants.INCLUDE_DESCRIPTOR_CLASSES_SUBOPTION.startsWith(nextWord))
            {
                markDescriptorClasses = true;
            }
            else if (ConfigurationConstants.INCLUDE_CODE_SUBOPTION              .startsWith(nextWord))
            {
                markCodeAttributes    = true;
            }
            else if (ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION           .startsWith(nextWord))
            {
                allowShrinking        = true;
            }
            else if (ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION        .startsWith(nextWord))
            {
                allowOptimization     = true;
            }
            else if (ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION         .startsWith(nextWord))
            {
                allowObfuscation      = true;
            }
            else
            {
                throw new ParseException("Expecting keyword '" + ConfigurationConstants.INCLUDE_DESCRIPTOR_CLASSES_SUBOPTION +
                                            "', '"                + ConfigurationConstants.INCLUDE_CODE_SUBOPTION +
                                            "', '"                + ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION +
                                            "', '"                + ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION +
                                            "', or '"             + ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION +
                                            "' before " + reader.locationDescription());
            }
        }
    
        // Read the class configuration.
        ClassSpecification classSpecification =
            parseClassSpecificationArguments(false);
    
        // Create and return the keep configuration.
        return new KeepClassSpecification(markClasses,
                                            markConditionally,
                                            markDescriptorClasses,
                                            markCodeAttributes,
                                            allowShrinking,
                                            allowOptimization,
                                            allowObfuscation,
                                            condition,
                                            classSpecification);
    }
    

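    The markClasses and markConditionally parameters are used in the shrink phase to indicate whether a class needs to be kept. We can see that a plain -keep passes markClasses as true, meaning the class itself is kept, whereas -keepclassmembers passes false, meaning the class can still be removed during shrinking. Reading the ProGuard source gives a deeper understanding of how the -keep rules behave.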
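
The flag combinations passed by the six keep options in the parse excerpt can be collected into one lookup table. The sketch below does this in plain Java. The class name KeepFlags and the map are hypothetical, while the boolean values are copied from the parseKeepClassSpecificationArguments calls shown earlier.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the three flags each keep option passes along. */
class KeepFlags {
    final boolean markClasses;
    final boolean markConditionally;
    final boolean allowShrinking;

    KeepFlags(boolean markClasses, boolean markConditionally, boolean allowShrinking) {
        this.markClasses       = markClasses;
        this.markConditionally = markConditionally;
        this.allowShrinking    = allowShrinking;
    }

    /** Keep option name mapped to the flags used in the parse excerpt. */
    static final Map<String, KeepFlags> BY_OPTION = new LinkedHashMap<>();
    static {
        BY_OPTION.put("-keep",                       new KeepFlags(true,  false, false));
        BY_OPTION.put("-keepclassmembers",           new KeepFlags(false, false, false));
        BY_OPTION.put("-keepclasseswithmembers",     new KeepFlags(false, true,  false));
        BY_OPTION.put("-keepnames",                  new KeepFlags(true,  false, true));
        BY_OPTION.put("-keepclassmembernames",       new KeepFlags(false, false, true));
        BY_OPTION.put("-keepclasseswithmembernames", new KeepFlags(false, true,  true));
    }
}
```

Read this way, each of the three *names variants passes the same first two flags as its base option and only switches allowShrinking to true.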

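    The first part of parseKeepClassSpecificationArguments is also easy to follow: it reads keywords and compares strings to set flags such as allowShrinking. For example, given the following keep rule:
    -keep,allowobfuscation class com.test.test
    the allowobfuscation modifier is read here, so the test class is kept but can still be obfuscated.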
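
The modifier loop can be sketched in isolation: after the keep option, every comma must be followed by one modifier, and the first word that is not a comma ends the list. The class KeepModifierSketch below is hypothetical. It splits on whitespace and commas instead of using a word reader, and it assumes the lowercase modifier spellings listed in SUBOPTIONS.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the comma-separated keep modifier loop. */
class KeepModifierSketch {
    // Modifier spellings assumed for this sketch.
    static final List<String> SUBOPTIONS = Arrays.asList(
        "includedescriptorclasses",
        "includecode",
        "allowshrinking",
        "allowoptimization",
        "allowobfuscation");

    /** Returns the modifiers that follow the keep option in the given rule. */
    static Set<String> parseModifiers(String rule) {
        // Make every comma a word of its own, then split on whitespace.
        String[] words = rule.replace(",", " , ").trim().split("\\s+");
        Set<String> modifiers = new LinkedHashSet<>();
        int i = 1; // words[0] is the keep option itself.
        while (i < words.length && words[i].equals(",")) {
            if (i + 1 >= words.length) {
                throw new IllegalArgumentException("Expecting a keep modifier after ','");
            }
            String word = words[i + 1];
            String match = null;
            for (String suboption : SUBOPTIONS) {
                // Same prefix-style check as in the excerpt (case-sensitive).
                if (suboption.startsWith(word)) {
                    match = suboption;
                    break;
                }
            }
            if (match == null) {
                throw new IllegalArgumentException("Unknown keep modifier: " + word);
            }
            modifiers.add(match);
            i += 2; // Skip the comma and the modifier.
        }
        return modifiers;
    }
}
```

Because String.startsWith is case-sensitive, a camel-case spelling such as allowObfuscation is rejected by this sketch, which is why the lowercase form is used here.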

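    Next, parseClassSpecificationArguments parses the more detailed keep rule for the class, such as the class name, the superclass, and which fields and methods must be kept. Finally, a KeepClassSpecification object is created to hold all the parsed parameters, and it ends up stored in the keep member of the Configuration object.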

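    • Summary
      This post outlined ProGuard's working stages and walked through the whole argument-parsing stage. The next post will continue with the initialization of ClassPool, ProgramClass, and related classes, and show how ProGuard parses class files into memory and manages them.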

    Link to this article: https://www.haomeiwen.com/subject/yiakdltx.html