Parsing C++ Command-Line Arguments
Published: 2019-06-22


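Preliminary note: the operating system's command-line parser (hereafter cmdlineparser) breaks the command line (commandline) into the arguments argv[0]…argv[n]. Here commandline is the input and argv is the output.

The Microsoft C/C++ startup code uses the following rules to parse the arguments given on the operating system command line:

  • cmdlineparser uses white space to split commandline into argv entries; white space is either a space (0x20) or a tab (0x09). Note that white space does not always separate arguments, because it can be part of an argument when it appears inside double quotes.
  • Unlike 0x20 and 0x09, the caret ^ (0x5E) is not recognized as an escape character or a delimiter; it is handled completely by the operating system's command-line parser before the arguments reach argv.
  • In commandline, a string enclosed in double quotes ("string") is interpreted as a single argument even if it contains spaces; for example, "a string" is parsed as a string. A quoted string can also be embedded in an argument; for example, d"e f"g is parsed as de fg.
  • In commandline, a double quote preceded by a backslash (0x5C), that is \", is interpreted as a literal double quote (") in argv.
  • Following rule 4, backslashes are interpreted literally in argv unless they immediately precede a double quote.
  • In commandline, if an even number of backslashes is followed by a double quote, each pair of backslashes becomes one backslash in argv, and the double quote is treated as a string delimiter: it opens or closes a quoted string and does not by itself separate arguments.
  • In commandline, if an odd number of backslashes is followed by a double quote, each pair of backslashes becomes one backslash in argv, and the remaining backslash plus the double quote is interpreted as a literal double quote, as in rule 4.

The rules above are my own translation and mainly reflect my understanding of the meaning. The original text follows: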

Microsoft C/C++ startup code uses the following rules when interpreting arguments given on the operating system command line:

  • Arguments are delimited by white space, which is either a space or a tab.
  • The caret character (^) is not recognized as an escape character or delimiter. The character is handled completely by the command-line parser in the operating system before being passed to the argv array in the program.
  • A string surrounded by double quotation marks ("string") is interpreted as a single argument, regardless of white space contained within. A quoted string can be embedded in an argument.
  • A double quotation mark preceded by a backslash (\") is interpreted as a literal double quotation mark character (").
  • Backslashes are interpreted literally, unless they immediately precede a double quotation mark.
  • If an even number of backslashes is followed by a double quotation mark, one backslash is placed in the argv array for every pair of backslashes, and the double quotation mark is interpreted as a string delimiter.
  • If an odd number of backslashes is followed by a double quotation mark, one backslash is placed in the argv array for every pair of backslashes, and the double quotation mark is "escaped" by the remaining backslash, causing a literal double quotation mark (") to be placed in argv.
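To make the three backslash rules concrete, here is a minimal C++ sketch. `BackslashResult` and `resolveBackslashRun` are hypothetical names invented for this illustration and are not part of any Microsoft API; the function only models what the rules say happens to a run of m backslashes, depending on whether a double quotation mark follows the run.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this illustration only.
struct BackslashResult {
    std::string emitted;   // characters placed in argv for this run
    bool quoteIsLiteral;   // true: the quote was copied into argv;
                           // false: the quote is a string delimiter,
                           //        or no quote follows the run at all
};

// Models the fate of `m` consecutive backslashes.
BackslashResult resolveBackslashRun(std::size_t m, bool followedByQuote) {
    if (!followedByQuote) {
        // Backslashes that do not precede a quote are taken literally.
        return {std::string(m, '\\'), false};
    }
    // Before a quote, every pair of backslashes collapses to one.
    std::string out(m / 2, '\\');
    const bool odd = (m % 2 == 1);
    if (odd) {
        out += '"';        // the leftover backslash escapes the quote
    }
    return {out, odd};
}
```

Under these assumptions, a run of three backslashes before a quotation mark yields one backslash plus a literal quotation mark, while a run of four yields two backslashes and leaves the quotation mark as a delimiter.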

Example

The following program demonstrates how command-line arguments are passed:

    // command_line_arguments.cpp
    // compile with: /EHsc
    #include <iostream>

    using namespace std;

    int main( int argc,       // Number of strings in array argv
              char *argv[],   // Array of command-line argument strings
              char *envp[] )  // Array of environment variable strings
    {
        int count;

        // Display each command-line argument.
        cout << "\nCommand-line arguments:\n";
        for( count = 0; count < argc; count++ )
            cout << "  argv[" << count << "]   " << argv[count] << "\n";
    }

The following table shows sample inputs and the expected output, illustrating the rules listed above.

Command-line input |   argv[1]   |   argv[2]    |   argv[3]
-------------------|-------------|--------------|---------------
"abc" d e          |   abc       |   d          |   e
a\\b d"e f"g h     |   a\\b      |   de fg      |   h
a\\\"b c d         |   a\"b      |   c          |   d
a\\\\"b c" d e     |   a\\b c    |   d          |   e


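Also:

The parsing of several double quotes in a row is downright bizarre; see the discussion at

  • (for easier reading, set your browser's character encoding to ISO-8859-1 and zoom in)

Note especially this supplementary remark from that discussion: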

  • And here's the missing undocumented rule:
    If a closing " is followed immediately by another ", the 2nd " is accepted literally and added to the parameter.

and its algorithm:

5.10  The Microsoft C/C++ Command Line Parameter Parsing Algorithm

The following algorithm was reverse engineered by disassembling a small C program compiled using Microsoft Visual C++ and examining the disassembled code:

1. Parse off parameter 0 (the program filename)

    * The entire parameter may be enclosed in double quotes (it handles double quoted parts)
      (Double quotes are necessary if there are any spaces or tabs in the parameter)
    * There is no special processing of backslashes (\)

2. Parse off next parameter:

    a. Skip over multiple spaces/tabs between parameters
      LOOP
    b. Count the backslashes (\). Let m = number of backslashes. (m may be zero.)
    c. IF next character following m backslashes is a double quote:
           IF m is even (or zero)
               IF currently in a double quoted part
                   IF next character is also a "
                       move to next character (the 2nd ". This character will be added to the parameter.)
                   ELSE
                       set flag to not add this " character to the parameter
                   ENDIF
               ELSE
                   set flag to not add this " character to the parameter
               ENDIF
               toggle double quoted part flag
           ENDIF
           m = m/2 (floor divide e.g. 0/2=0, 1/2=0, 2/2=1, 3/2=1, 4/2=2, 5/2=2, etc.)
       ENDIF
    d. add m backslashes
    e. add this character to our parameter
      ENDLOOP
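Step 2 of the listing can be sketched in C++ roughly as follows. This is an illustration under stated assumptions, not Microsoft's code: `parseParameters` is a hypothetical name, the input is taken to be the command line with parameter 0 already removed, the quoted-part flag is toggled on every quotation mark preceded by an even number of backslashes (so that an opening quote can start a quoted part), and a parameter is assumed to end at unquoted white space or at the end of the input, which the listing leaves implicit.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits the part of the command line that follows
// the program name into parameters, following step 2 of the listing.
std::vector<std::string> parseParameters(const std::string& cmd) {
    std::vector<std::string> params;
    const std::size_t n = cmd.size();
    std::size_t i = 0;

    while (true) {
        // a. Skip spaces/tabs between parameters.
        while (i < n && (cmd[i] == ' ' || cmd[i] == '\t')) ++i;
        if (i >= n) break;

        std::string param;
        bool inQuotes = false;
        while (true) {  // LOOP
            // b. Count the backslashes.
            std::size_t m = 0;
            while (i < n && cmd[i] == '\\') { ++m; ++i; }

            // c. Decide how a following double quote is treated.
            bool addChar = true;
            if (i < n && cmd[i] == '"') {
                if (m % 2 == 0) {
                    if (inQuotes && i + 1 < n && cmd[i + 1] == '"') {
                        ++i;              // keep the 2nd quote literally
                    } else {
                        addChar = false;  // the quote is only a delimiter
                    }
                    inQuotes = !inQuotes; // assumption: toggle in all cases
                }
                m /= 2;
            }

            // d. Add m backslashes.
            param.append(m, '\\');

            // Assumed termination test, implicit in the listing.
            if (i >= n) break;
            if (!inQuotes && (cmd[i] == ' ' || cmd[i] == '\t')) break;

            // e. Add this character to the parameter.
            if (addChar) param += cmd[i];
            ++i;
        }  // ENDLOOP
        params.push_back(param);
    }
    return params;
}
```

With these assumptions the sketch reproduces the rows of the table above, for instance turning a\\\\"b c" d e into the three parameters a\\b c, d and e. It also follows the undocumented rule: the input "a b"" gives the single parameter a b" with a trailing literal quotation mark.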

Reposted from: https://my.oschina.net/jacobin/blog/153257
