Getopt(3) — linux man page

History

A long-standing issue with command-line programs was how to specify options; early programs used many ways of doing so, including single-character options (-a), multiple options specified together (-abc is equivalent to -a -b -c), multicharacter options (-inum), options with arguments (-a arg, -inum 3, -a=arg), and different prefix characters (-a, +b, /c).

The getopt function was written to be a standard mechanism that all programs could use to parse command-line options, so that there would be a common interface on which everyone could depend. To that end, the original authors selected from these variations support for single-character options, multiple options specified together, and options with arguments (-a arg or -aarg), all controllable by an option string.

getopt dates back to at least 1980 and was first published by AT&T at the 1985 UNIFORUM conference in Dallas, Texas, with the intent for it to be available in the public domain. Versions of it were subsequently picked up by other flavors of Unix (4.3BSD, Linux, etc.). It is specified in the POSIX.2 standard as part of the unistd.h header file. Derivatives of getopt have been created for many programming languages to parse command-line options.

Extensions

getopt is a system-dependent function; its behavior depends on the implementation in the C library. Custom implementations, such as the one in gnulib, are also available.

The conventional (POSIX and BSD) handling is that options end when the first non-option argument is encountered, at which point getopt returns -1. In the glibc extension, however, options are allowed anywhere for ease of use; getopt implicitly permutes the argument vector so that the non-options are left at the end. Since POSIX already has the convention of returning -1 on "--" and skipping it, "--" can always be used portably as an end-of-options signifier.
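The difference between the two scanning modes can be tried out with Python's getopt module, whose getopt() stops at the first non-option argument and whose gnu_getopt() accepts options anywhere. This is only a sketch: the option letters -a and -b and the file names are made up for illustration, and gnu_getopt() falls back to the POSIX behavior if the POSIXLY_CORRECT environment variable is set.

```python
import getopt


def parse_posix(argv):
    """POSIX-style scan: option parsing stops at the first non-option argument."""
    return getopt.getopt(argv, "ab")


def parse_gnu(argv):
    """GNU-style scan: options may appear anywhere; non-options are collected."""
    return getopt.gnu_getopt(argv, "ab")


# Hypothetical command line with an option placed after an operand.
mixed = ["-a", "file1", "-b", "file2"]
print(parse_posix(mixed))   # "-b" is left among the operands
print(parse_gnu(mixed))     # "-b" is still recognized as an option

# "--" ends option parsing in both modes, so "-b" stays an operand.
guarded = ["-a", "--", "-b", "file2"]
print(parse_posix(guarded))
print(parse_gnu(guarded))
```

Passing "--" before the operands gives the same result in either mode, which is why it is the portable way to protect operands that begin with a dash.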

A GNU extension, getopt_long, allows parsing of more readable, multicharacter options, which are introduced by two dashes instead of one. The choice of two dashes allows multicharacter options (--inum) to be differentiated from single-character options specified together (-abc). The GNU extension also allows an alternative format for options with arguments: --name=arg. This interface proved popular, and has been taken up (without the permutation) by many BSD distributions, including FreeBSD, as well as by Solaris. An alternative way to support long options is seen in Solaris and Korn Shell (extending optstring), but it was not as popular.

Another common advanced extension of getopt is resetting the state of argument parsing; this is useful as a replacement for the options-anywhere GNU extension, or as a way to "layer" a set of command-line interfaces with different options at different levels. This is achieved on BSD systems using an optreset variable, and on GNU systems by setting optind to 0.

A common companion function to getopt is getsubopt. It parses a string of comma-separated sub-options.

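Using getopt_long()

Because getopt_long_demo is almost identical to the getopt_demo code just discussed, I will explain only the parts that changed. Since there is more flexibility now, I will also add support for a --randomize option, which has no corresponding short option.

The getopt_long() function lives in the getopt.h header (rather than unistd.h), so that header needs to be included (see Listing 11). I have also included string.h, because strcmp() will be used later to help determine which long option is being handled.

Listing 11. Additional headers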
#include <getopt.h>
#include <string.h>

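A flag for the --randomize option has been added to globalArgs (see Listing 12), and a longOpts array has been created to hold information about the long options this program supports. Except for --randomize, all of the options correspond to existing short options (for example, --no-index is equivalent to -I). By including the short option equivalent in the option structure, the equivalent long options can be handled without adding any other code to the program.

Listing 12. Expanded arguments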
struct globalArgs_t {
    int noIndex;                /* -I option */
    char *langCode;             /* -l option */
    const char *outFileName;    /* -o option */
    FILE *outFile;
    int verbosity;              /* -v option */
    char **inputFiles;          /* input files */
    int numInputFiles;          /* # of input files */
    int randomized;             /* --randomize option */
} globalArgs;

static const char *optString = "Il:o:vh?";

static const struct option longOpts[] = {
    { "no-index", no_argument, NULL, 'I' },
    { "language", required_argument, NULL, 'l' },
    { "output", required_argument, NULL, 'o' },
    { "verbose", no_argument, NULL, 'v' },
    { "randomize", no_argument, NULL, 0 },
    { "help", no_argument, NULL, 'h' },
    { NULL, no_argument, NULL, 0 }
};

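The getopt() call has been changed to getopt_long(), which, in addition to getopt()'s arguments, takes the longOpts array and an int pointer (longIndex). When getopt_long() returns 0, the integer pointed to by longIndex is set to the index of the long option that was just found.

Listing 13. New and improved option processing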
opt = getopt_long( argc, argv, optString, longOpts, &longIndex );
while( opt != -1 ) {
    switch( opt ) {
        case 'I':
            globalArgs.noIndex = 1; /* true */
            break;

        case 'l':
            globalArgs.langCode = optarg;
            break;

        case 'o':
            globalArgs.outFileName = optarg;
            break;

        case 'v':
            globalArgs.verbosity++;
            break;

        case 'h':   /* fall-through is intentional */
        case '?':
            display_usage();
            break;

        case 0:     /* long option without a short arg */
            /* longIndex selects the longOpts entry that matched */
            if( strcmp( "randomize", longOpts[longIndex].name ) == 0 ) {
                globalArgs.randomized = 1;
            }
            break;

        default:
            /* You won't actually get here. */
            break;
    }

    opt = getopt_long( argc, argv, optString, longOpts, &longIndex );
}

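I have also added the 0 case to handle any long options that do not match an existing short option. In this example there is only one such long option, but the code still uses strcmp() to make sure it is the expected one.

That is all there is to it; the program now supports the more verbose (and friendlier to casual users) long options.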
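The same idea, long options that mirror short ones plus one long-only option, can be sketched with Python's getopt module. This is an illustration rather than a port of the demo program: the EQUIVALENTS table and the settings dictionary are hypothetical names introduced here, and Python's getopt returns the option text itself instead of a val field.

```python
import getopt

SHORT_OPTS = "Il:o:v"
LONG_OPTS = ["no-index", "language=", "output=", "verbose", "randomize"]

# Hypothetical lookup table: long names that simply mirror a short option.
EQUIVALENTS = {
    "--no-index": "-I",
    "--language": "-l",
    "--output": "-o",
    "--verbose": "-v",
}


def parse(argv):
    """Parse argv and return a dictionary of settings."""
    settings = {"no_index": False, "lang": None, "out": None,
                "verbosity": 0, "randomized": False}
    opts, inputs = getopt.getopt(argv, SHORT_OPTS, LONG_OPTS)
    settings["inputs"] = inputs
    for opt, arg in opts:
        opt = EQUIVALENTS.get(opt, opt)   # fold long names onto short ones
        if opt == "-I":
            settings["no_index"] = True
        elif opt == "-l":
            settings["lang"] = arg
        elif opt == "-o":
            settings["out"] = arg
        elif opt == "-v":
            settings["verbosity"] += 1
        elif opt == "--randomize":        # long-only option, no short form
            settings["randomized"] = True
    return settings


print(parse(["--no-index", "-v", "--verbose", "--language=fr",
             "--randomize", "a.doc"]))
```

Folding the long names onto their short equivalents keeps a single branch per option, which is the same economy the val field provides in the C version.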

Parsing Command-Line Arguments

Python provides a getopt module that helps you parse command-line options and arguments. This module provides two functions and an exception to enable command-line argument parsing.

getopt.getopt method

This method parses the command-line options and parameter list. Following is a simple syntax for this method −

getopt.getopt(args, options, [long_options])

Here is the detail of the parameters −

  • args − This is the argument list to be parsed.

  • options − This is the string of option letters that the script wants to recognize; options that require an argument should be followed by a colon (:).

  • long_options − This is an optional parameter and, if specified, must be a list of strings with the names of the long options that should be supported. Long options that require an argument should be followed by an equal sign ('='). To accept only long options, options should be an empty string.

This method returns a value consisting of two elements − the first is a list of (option, value) pairs, the second is a list of program arguments left after the option list was stripped.

Each option-and-value pair returned has the option as its first element, prefixed with a hyphen for short options (e.g., '-x') or two hyphens for long options (e.g., '--long-option').

Exception getopt.GetoptError

This is raised when an unrecognized option is found in the argument list or when an option requiring an argument is given none.

The argument to the exception is a string indicating the cause of the error. The attributes msg and opt give the error message and related option.
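A short sketch of how the msg and opt attributes can be inspected is given here. The helper name try_parse and the sample file name are assumptions made for the illustration; the exact wording of msg comes from the library and is not relied on.

```python
import getopt


def try_parse(argv):
    """Return (opts, args) on success, or (msg, opt) taken from GetoptError."""
    try:
        return getopt.getopt(argv, "hi:o:", ["ifile=", "ofile="])
    except getopt.GetoptError as err:
        # err.msg is the human-readable message, err.opt the offending option.
        return err.msg, err.opt


print(try_parse(["-x"]))             # unrecognized option
print(try_parse(["-i"]))             # option requiring an argument is given none
print(try_parse(["-i", "in.txt"]))   # successful parse
```

In a real script the except branch would normally print a usage message and exit with a non-zero status instead of returning the pair.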

Example

Suppose we want to pass two file names through command line and we also want to give an option to check the usage of the script. Usage of the script is as follows −

usage: test.py -i <inputfile> -o <outputfile>

Here is the script test.py −

#!/usr/bin/python3

import sys, getopt

USAGE = 'usage: test.py -i <inputfile> -o <outputfile>'

def main(argv):
   inputfile = ''
   outputfile = ''
   try:
      # The long options --ifile/--ofile mirror -i/-o and also take an argument.
      opts, args = getopt.getopt(argv, "hi:o:", ["ifile=", "ofile="])
   except getopt.GetoptError:
      print(USAGE)
      sys.exit(2)
   for opt, arg in opts:
      if opt == '-h':
         print(USAGE)
         sys.exit()
      elif opt in ("-i", "--ifile"):
         inputfile = arg
      elif opt in ("-o", "--ofile"):
         outputfile = arg
   print('Input file is', inputfile)
   print('Output file is', outputfile)

if __name__ == "__main__":
   # Skip sys.argv[0], the script name, so parsing starts at the first option.
   main(sys.argv[1:])

Output

Now, run the above script as follows −

$ test.py -h
usage: test.py -i <inputfile> -o <outputfile>

$ test.py -i BMP -o
usage: test.py -i <inputfile> -o <outputfile>

$ test.py -i inputfile -o outputfile
Input file is inputfile
Output file is outputfile


Summary

UNIX users have always depended on command-line arguments to modify the behavior of programs, especially utilities designed to be used as part of the collection of small tools that is the UNIX shell environment. Programs need to be able to handle options and arguments quickly, and without wasting a lot of the developer’s time. After all, few programs are designed to simply process command-line arguments, and the developer would rather be working on whatever the program really does.

The getopt() function is a standard library call that lets you loop over a program’s command-line arguments and detect options (with or without arguments attached to them) easily using a straightforward while/switch idiom. Its cousin, getopt_long(), lets you handle the more descriptive long options with almost no additional work, which is something that makes developers very happy.

Now that you’ve seen how to easily handle command-line options, you can concentrate on improving your program’s command line by adding support for long options, and by adding any additional options you might have been putting off because you didn’t want to add additional command-line option handling to your program.

Don’t forget to document all of your options and arguments somewhere, and to provide a built-in help function of some sort to help remind forgetful users.

Downloadable resources

  • Sample getopt() program (au-getopt_demo.zip | 23KB)
  • Sample getopt_long() program (au-getopt_long_demo.zip | 24KB)

Related topics

  • “C/C++ development with the Eclipse Platform” (developerWorks, March 2006): Learn how to use C++ with Eclipse.
  • IBM trial software: Build your next development project with software for download directly from developerWorks.

Complex command-line processing: getopt_long()

In the 1990s, UNIX applications began to support long options: a pair of dashes (instead of the single dash used for normal, short options), a descriptive option name, and, where needed, an argument attached to the option with an equals sign.

Fortunately, long option support can be added to a program with getopt_long(). getopt_long() is a version of getopt() that supports long options in addition to short ones.

The getopt_long() function takes additional parameters, one of which is a pointer to an array of struct option structures. As Listing 10 shows, this structure is fairly simple.

Listing 10. The option structure for getopt_long()
struct option {
    char *name;
    int has_arg;
    int *flag;
    int val;
};

The name field is a pointer to the long option name, without the double dash. The has_arg field takes one of the values no_argument, optional_argument, or required_argument (all defined in getopt.h) to indicate whether the option takes an argument. If the flag field is not set to NULL, then as soon as the option is encountered during processing, the int that flag points to is assigned the value of the val field. If flag is set to NULL, the value of val is returned by getopt_long() when it encounters the option; if val is set to the option's short equivalent, getopt_long() can be used without any additional code, because the existing getopt() while loop and switch block will handle the option automatically.

The program is already more flexible, because options can now have optional arguments.

More importantly, the code above is very easy to drop into an existing program.

Let's see how introducing getopt_long() changes the sample program (the project can be found in the Downloadable resources section).

EXAMPLES

Consider the bash script below:

#!/bin/bash

# read the options
TEMP=`getopt -o f:s::d:a:: --long file-name:,source::,destination:,action:: -- "$@"`
eval set -- "$TEMP"

# extract options and their arguments into variables.
while true ; do
    case "$1" in
        -f|--file-name)
            fileName=$2 ; shift 2 ;;
        -s|--source)
            case "$2" in
                "") sourceDir='.' ; shift 2 ;;
                 *) sourceDir=$2 ; shift 2 ;;
            esac ;;
        -d|--destination)
            destinationDir=$2 ; shift 2;;
        -a|--action)
            case "$2" in
                "copy"|"move") action=$2 ; shift 2 ;;
                            *) action="copy" ; shift 2 ;;
            esac ;;
        --) shift ; break ;;
        *) echo "Internal error!" ; exit 1 ;;
    esac
done

# Now take action
echo "$action file $fileName from $sourceDir to $destinationDir"

This script has the following four options, each with a long and a short form:

-f or --file-name: Parameter is mandatory (indicated by :)
-s or --source: Parameter is optional (indicated by ::); the default is the current directory
-d or --destination: Parameter is mandatory
-a or --action: Parameter is optional (indicated by ::); only the lowercase values copy and move are accepted, and the default is copy

2. Specify all long options

$ ./getopt.sh --file-name MyFile.txt --source=/home --dest /home/test --act=move
move file MyFile.txt from /home to /home/test

3. Omit optional parameters

$ ./getopt.sh --file-name MyFile.txt --source=/home --dest /home/test --act
copy file MyFile.txt from /home to /home/test

4. Omit mandatory parameter

$ ./getopt.sh --file-name MyFile.txt --source=/home -a Move -d
getopt: option requires an argument -- d
copy file MyFile.txt from /home to
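
The invocations above shorten --destination to --dest and --action to --act, which works because unique prefixes of long options are accepted. Python's getopt module treats long options the same way, so the behavior can be sketched there. The option names below are borrowed from the script; the "debug" option in the second list is a hypothetical addition used only to create an ambiguous prefix, and the optional-argument options are left out.

```python
import getopt

LONG_OPTS = ["file-name=", "destination="]


def parse(argv):
    """Return the parsed options as a dict, plus the remaining arguments."""
    opts, rest = getopt.getopt(argv, "f:d:", LONG_OPTS)
    return dict(opts), rest


def is_ambiguous(argv, long_opts):
    """True if parsing fails, for example because a prefix matches two options."""
    try:
        getopt.getopt(argv, "", long_opts)
    except getopt.GetoptError:
        return True
    return False


# Abbreviated names are reported under their full option name.
print(parse(["--file", "MyFile.txt", "--dest", "/home/test"]))

# "--de" could mean either --debug or --destination, so it is rejected.
print(is_ambiguous(["--de"], ["debug", "destination="]))
```

Abbreviations are convenient interactively, but scripts that call other scripts are usually safer spelling option names out in full, since a newly added option can make an old prefix ambiguous.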


Introducing getopts

There is a convenient utility which parses these options for you; it is called getopts, and whilst its usage can feel a
little strange, using this technique will allow your scripts to process options in a standardised and familiar-feeling way.

The first argument you pass to getopts is a list of which letters (or numbers, or any other single character)
it will accept. Each character, if followed by a colon, is expected to be followed by an argument; for example, the -f option of tar
always has to be followed by the name of a tar file. This option argument is passed to your script in the OPTARG variable.

getopts will also set the OPTIND variable for you; we will deal with that later.

The second argument that you pass to getopts is the name of a variable which will be populated with the character of the current
switch. Often, this is given a short name such as c (as in the script below), although it can have any name you choose.

This example script can save/restore files (as a tarball) and a database. You must pass it either -s (Save) or
-r (Restore). If you pass -d then it will use that name to dump (or restore) the database; if you pass -f,
it will use that name for the tarball to create (or extract) the files.

There are a few things this first draft of the script doesn’t deal with; passing both -s and -r is invalid. If you do,
this script will take the last thing you said, so -r followed by -s will Save (not Restore) since the last thing
it processed was the Save command.

Similarly, if you don’t pass at least one of -d and -f, then nothing will happen at all.

The reason for the unset DB_DUMP TARBALL ACTION line is that the script does not want to be influenced by any environment
variables which may be already set. Note that this will only affect the scope of the running script; the calling shell won’t have its
variables changed.

For brevity, I have not defined the save_database, restore_database, save_files and restore_files
functions here; the downloadable scripts do have dummy functions so that the scripts will actually run for you. They just
display what would be done, but don’t actually do anything to your files.

The script (getopts1.sh):

#!/bin/bash

unset DB_DUMP TARBALL ACTION

while getopts 'srd:f:' c
do
  case $c in
    s) ACTION=SAVE ;;
    r) ACTION=RESTORE ;;
    d) DB_DUMP=$OPTARG ;;
    f) TARBALL=$OPTARG ;;
  esac
done

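# Only touch the database if -d supplied a dump file name
if [ -n "$DB_DUMP" ]; then
  case $ACTION in
    SAVE)    save_database "$DB_DUMP"    ;;
    RESTORE) restore_database "$DB_DUMP" ;;
  esac
fi

# Only touch the files if -f supplied a tarball name
if [ -n "$TARBALL" ]; then
  case $ACTION in
    SAVE)    save_files "$TARBALL"    ;;
    RESTORE) restore_files "$TARBALL" ;;
  esac
fi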

The getopts command is an argument to a while loop; each time through the loop, it processes the next switch, and sets the
c variable to the character of the switch. You can read more about while loops and case statements
in the main tutorial.

If we call this script as: getopts1.sh -s -r -d <dumpfile> -f <tarball> -s, it will process the -s, set
c=s, and we run into the case statement for the first time. This sees that c=s, sets ACTION=SAVE,
and the ;; at the end of that line tells it to stop processing, and it goes back to getopts for the next run around
the loop. This reads -r, which logically doesn’t make sense (we can’t have it both save the backup and restore
the backup at the same time), but the script doesn’t know that, so it sets c=r, the case statement sets ACTION=RESTORE, and we go back to getopts to process the next argument.

Now, getopts sets c=d and also sets OPTARG to the dump file name, because the ‘d:’ in the getopts
invocation tells it that -d is followed by an argument (the name of the database dump file). Execution goes on to the case
statement, which sets DB_DUMP=$OPTARG. When we get in to the main body of the script, if the DB_DUMP variable
has a value, then it will either save the database to that file, or restore it from that file.

The next option is -f, and the same process is followed; getopts sets c=f and also sets
OPTARG to the tarball name. The case statement reads these, and sets TARBALL=$OPTARG.

Finally, we passed it yet another -s switch, so it will change the ACTION variable back to SAVE.

When the main script starts, it checks if DB_DUMP is set, then checks the value of ACTION, and either saves or
restores the database, using DB_DUMP, according to the value of ACTION.

Similarly, it checks if TARBALL has been set, and either saves or restores the files with TARBALL as the argument.
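
The walk-through above can be replayed in Python, whose getopt module accepts the same 'srd:f:' option string. This is a sketch of the option loop only, not of the backup logic, and the file names db.sql and files.tar are made up for the example.

```python
import getopt


def parse(argv):
    """Mimic the shell loop: later switches overwrite earlier ones."""
    state = {"ACTION": None, "DB_DUMP": None, "TARBALL": None}
    opts, _ = getopt.getopt(argv, "srd:f:")
    for flag, value in opts:
        if flag == "-s":
            state["ACTION"] = "SAVE"
        elif flag == "-r":
            state["ACTION"] = "RESTORE"
        elif flag == "-d":
            state["DB_DUMP"] = value     # plays the role of OPTARG
        elif flag == "-f":
            state["TARBALL"] = value
    return state


# Same switch order as in the walk-through: -s, -r, -d, -f, then -s again.
print(parse(["-s", "-r", "-d", "db.sql", "-f", "files.tar", "-s"]))
```

Because each switch simply overwrites ACTION, the final -s wins, which matches the "last thing you said" behavior described for the shell script.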

Example

Below is a more complete example program which takes 5 options:
-o, -v, --output, --verbose, and --version. The
-o, --output, and --version options each require an
argument.

import getopt
import sys

version = '1.0'
verbose = False
output_filename = 'default.out'

print('ARGV      :', sys.argv[1:])

# Short options: -o takes an argument, -v does not.
# Long options: a trailing '=' marks the ones that take an argument.
options, remainder = getopt.getopt(sys.argv[1:], 'o:v', ['output=',
                                                         'verbose',
                                                         'version=',
                                                         ])
print('OPTIONS   :', options)

for opt, arg in options:
    if opt in ('-o', '--output'):
        output_filename = arg
    elif opt in ('-v', '--verbose'):
        verbose = True
    elif opt == '--version':
        version = arg

print('VERSION   :', version)
print('VERBOSE   :', verbose)
print('OUTPUT    :', output_filename)
print('REMAINING :', remainder)

The program can be called in a variety of ways.

$ python getopt_example.py

ARGV      : []
OPTIONS   : []
VERSION   : 1.0
VERBOSE   : False
OUTPUT    : default.out
REMAINING : []

A single letter option can be separate from its argument:

$ python getopt_example.py -o foo

ARGV      : ['-o', 'foo']
OPTIONS   : [('-o', 'foo')]
VERSION   : 1.0
VERBOSE   : False
OUTPUT    : foo
REMAINING : []

or combined:

$ python getopt_example.py -ofoo

ARGV      : ['-ofoo']
OPTIONS   : [('-o', 'foo')]
VERSION   : 1.0
VERBOSE   : False
OUTPUT    : foo
REMAINING : []

A long form option can similarly be separate:

$ python getopt_example.py --output foo

ARGV      : ['--output', 'foo']
OPTIONS   : [('--output', 'foo')]
VERSION   : 1.0
VERBOSE   : False
OUTPUT    : foo
REMAINING : []

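or combined, with =:

$ python getopt_example.py --output=foo

ARGV      : ['--output=foo']
OPTIONS   : [('--output', 'foo')]
VERSION   : 1.0
VERBOSE   : False
OUTPUT    : foo
REMAINING : []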

Related topics

  • “What is Eclipse, and how do I use it?” (developerWorks, November 2001): Be sure to read this article for questions and answers about Eclipse.
  • Get started now with Eclipse: Download and start using Eclipse.
  • “C/C++ development with the Eclipse Platform” (developerWorks, March 2006): Learn how to use C++ with Eclipse.
  • Network services: Learn the differences between legacy and threaded designs.
  • “Build UNIX software with Eclipse” (developerWorks, March 2006): Read this article to learn the basics of building UNIX software with Eclipse.

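The command line

When writing a new program, one of the first hurdles is deciding how to handle the command-line arguments that control its behavior. These arrive from the command line in your program's main() function as an integer count (normally named argc) and an array of pointers to strings (normally named argv). The standard main() function can be declared in two essentially identical ways, as shown in Listing 1.

Listing 1. Two ways of declaring main()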
int main( int argc, char *argv[] );
int main( int argc, char **argv );

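The first form, which takes an array of pointers to char, seems to be fashionable these days and is slightly clearer than the second, with its pointer to pointers to char. For some reason I use the second form more often, which probably stems from my struggles learning C pointers back in high school. For all intents and purposes the two are identical, so use whichever one you like best.

When the C run-time library's program start-up code calls your main(), the command line has already been processed. The argc parameter holds the count of arguments, and argv holds an array of pointers to those arguments. To the C run-time library, the arguments are the name of the program plus anything after the program name, separated by white space.

For example, if you run a program named foo with the arguments -v bar www.ibm.com, argc will be set to 4 and argv will be set up as shown in Listing 2.

Listing 2. Contents of argv
argv[0] - foo
argv[1] - -v
argv[2] - bar
argv[3] - www.ibm.com

A program has only one set of command-line arguments, so I am going to store this information in a global structure that tracks options and settings. Anything that makes sense for the program to keep track of can go into this structure, and I will use a structure to help reduce the number of global variables. As I mentioned in my network services design article (see Related topics), global variables are a very poor fit for threaded programming, so use them with care.

The sample code demonstrates command-line processing for an imaginary doc2html program. The doc2html program converts some sort of document into HTML, controlled by the command-line options the user specifies. It supports the following options:

  • -I: do not create a keyword index.
  • -l lang: convert to the language specified by the language code lang.
  • -o outfile.html: write the converted document to outfile.html instead of printing it to standard output.
  • -v: be verbose while converting; can be specified multiple times to increase the diagnostic level.
  • Any other file names are used as input documents.

You will also support -h and -? to print a help message that explains what each option does.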
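
The option set listed above can be prototyped in a few lines with Python's getopt module before committing to the C version. This is a sketch under stated assumptions: GlobalArgs and parse_args are illustrative names that loosely mirror the global structure described above, and they are not part of the sample program.

```python
import getopt
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GlobalArgs:
    """Hypothetical settings holder mirroring the doc2html options."""
    no_index: bool = False                # -I
    lang_code: Optional[str] = None       # -l lang
    out_file_name: Optional[str] = None   # -o outfile
    verbosity: int = 0                    # -v, may be repeated
    show_help: bool = False               # -h or -?
    input_files: List[str] = field(default_factory=list)


def parse_args(argv):
    """Parse the doc2html-style options; argv excludes the program name."""
    result = GlobalArgs()
    opts, result.input_files = getopt.getopt(argv, "Il:o:vh?")
    for opt, value in opts:
        if opt == "-I":
            result.no_index = True
        elif opt == "-l":
            result.lang_code = value
        elif opt == "-o":
            result.out_file_name = value
        elif opt == "-v":
            result.verbosity += 1
        elif opt in ("-h", "-?"):
            result.show_help = True
    return result


print(parse_args(["-I", "-vv", "-l", "fr", "-o", "out.html", "a.doc", "b.doc"]))
```

Keeping every setting in one object makes it easy to pass the parsed state around instead of relying on separate global variables.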

CONFORMING TO

       getopt():
              POSIX.1-2001, POSIX.1-2008, and POSIX.2, provided the
              environment variable POSIXLY_CORRECT is set.  Otherwise, the
              elements of argv aren't really const, because these functions
              permute them.  Nevertheless, const is used in the prototype to
              be compatible with other systems.

              The use of '+' and '-' in optstring is a GNU extension.

              On some older implementations, getopt() was declared in
              <stdio.h>.  SUSv1 permitted the declaration to appear in
              either <unistd.h> or <stdio.h>.  POSIX.1-1996 marked the use
              of <stdio.h> for this purpose as LEGACY.  POSIX.1-2001 does
              not require the declaration to appear in <stdio.h>.

       getopt_long() and getopt_long_only():
              These functions are GNU extensions.

Function Arguments

The getopt function takes three arguments:

  • The first argument is the sequence of arguments to be parsed. This
    usually comes from sys.argv[1:] (ignoring the program name in
    sys.argv[0]).
  • The second argument is the option definition string for single character
    options. If one of the options requires an argument, its letter is followed
    by a colon.
  • The third argument, if used, should be a sequence of the long-style
    option names. Long style options can be more than a single
    character, such as --noarg or --witharg. The option names in
    the sequence should not include the -- prefix. If any long
    option requires an argument, its name should have a suffix of =.
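
The three arguments described above can be seen together in a small sketch. The short options -a and -b and the argument values are assumptions chosen for the illustration; --noarg and --witharg are the names used in the list.

```python
import getopt


def parse(argv):
    """Short options: -a (flag) and -b (takes a value).
    Long options: --noarg (flag) and --witharg (takes a value)."""
    return getopt.getopt(argv, "ab:", ["noarg", "witharg="])


# Separate and "=" forms of a long option argument are both accepted.
print(parse(["-a", "-b", "val", "--noarg", "--witharg", "val2", "rest"]))
print(parse(["--witharg=val2"]))
```

Note that the long names are listed without the leading "--", while the parsed result reports them with it.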

Python getopt

  • The Python getopt module works very much like the C getopt() function for parsing command-line parameters.
  • Because it follows the conventions of the C and Unix getopt() function, users familiar with those conventions will find the Python getopt module functions easy to use.

If you want a simpler module to parse command-line parameters, try argparse.

Python getopt function

getopt() is the first function provided by the module of the same name.

It parses the command-line options and parameter list. The signature of this function is:

getopt.getopt(args, shortopts, longopts=[])

Its arguments are:

  • args is the list of arguments to be parsed.
  • shortopts is the string of single-character options this script accepts.
  • The optional parameter longopts is the list of long option names, as strings, that should be supported. Note that the names should not be prefixed with --.

Let us study this function using some examples.

Python getopt example

This may look tricky at first glance, so we will start with a simple example and explain it afterwards.

Here is the code snippet:

In this example, we simply accepted some arguments. Before running the script, let’s establish our understanding on what happened here:

  • In sys.argv[1:], we start at index 1 because sys.argv[0] is the name of the script that we’re running, which we need not access in our script.
  • Now, the getopt function takes in three parameters:
    the command-line argument list which we get from sys.argv[1:], a string containing all the single-character command-line options that the script accepts, and a list of longer command-line options that are equivalent to the single-character versions.
  • If anything goes wrong with the getopt call, we can catch the exception and handle it gracefully. Here, we just exit the script.
  • As with any other command in any operating system, it is wise to print usage details when a user runs a script incorrectly.

So, what does the option string mean?

The first and last options are defaults. The option in the middle is a custom one; notice the colon after it? The colon means that this option takes a value. Finally, the single-character versions are the same as the longer versions, so you can use either one.

Let’s run the script now:

So, this collected the options and arguments in separate lists. The best part about getopt is that it allows us to gracefully manage any possible exceptions:

There is an important point to note about a long option that takes a value: it must always be provided with an additional argument, exactly like its short counterpart. This is indicated by an equals sign after its name in the long option list.

Python gnu_getopt() for GNU style Parsing

In Python 2.3, another function called gnu_getopt() was added to the getopt module. This function is very similar to the original getopt() function, except that GNU style scanning is used by default. Let’s see an example script on how to use this function:
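
A minimal sketch of GNU style scanning follows, assuming hypothetical options -v/--verbose and -o/--output and made-up file names. It also shows the documented leading '+' in the short option string, which makes gnu_getopt() stop at the first non-option argument again; setting the POSIXLY_CORRECT environment variable has the same effect.

```python
import getopt

# Hypothetical command line: an operand sits between the options.
ARGV = ["-v", "input.txt", "--output", "out.txt"]
LONG_OPTS = ["verbose", "output="]


def scan(shortopts):
    """Run gnu_getopt over the sample command line with the given short options."""
    return getopt.gnu_getopt(ARGV, shortopts, LONG_OPTS)


print(scan("vo:"))    # GNU style: --output after input.txt is still parsed
print(scan("+vo:"))   # leading '+': scanning stops at input.txt
```

With the '+' prefix everything from the first operand onward is returned untouched, which is useful when the remaining arguments belong to another command.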

Before we establish an understanding, let’s run this script:

We can even try running this script without any arguments:

This describes default values which are assigned to values when no arguments are passed.

Don’t forget to try the argparse module if you want more flexibility.

In this lesson, we learned about various ways through which we can manage the command-line parameters with getopt module in Python.

Reference: API Doc
