Automation and Advanced Features Tutorial

Objectives

This tutorial will teach you how to use source- and sink-generating analysis policies and combine them with others.

General concepts

So far, we have defined sources and sinks through program instrumentation. However, this is not the most practical solution, as it requires knowledge of both the code and the analyzed behaviours.

Colorstreams implements multiple analysis policies tackling this issue. In this tutorial, we will consider the following:

  • the autosource policy, which generates sources through function stubs
  • the oobautosink policy, which automatically detects out-of-bounds memory operations and generates sinks accordingly
  • the compose policy, which redirects sources and sinks generated by producer analyses to consumer analyses

We will analyze target2, which is target without instrumentation. To build it, run:

make target2.run

Generating sources with autosource

The autosource policy automatically generates sources through user-configured function stubs. A stub is called when entering its associated function and may perform a variety of actions, including some that are executed just before the function returns. In particular, we will use the bytes_to_buf and ret_reg stubs, which colorstreams -autosource-list-source-stubs describes as follows:

- bytes_to_buf(entry / ret, buf, size, source name, source desc):
	<size> first bytes of <buf> after entry or before return (format: [0-9]+ (argnum) | retval | gdb<gdb expr> | ret<expr> (evaluated before return) | strlen<expr>, with @[0-9]+ and @ret for arguments and the return value in gdb expressions)
- ret_reg(source desc):
	return value register

NOTE: it is currently not possible to call functions in GDB expressions. Doing so will result in a deadlock.

Conveniently, target2 has a get_input function which parses both the message and message size inputs:

uint64_t get_input(int argc, char **argv, char **msg)
{
    if(argc != 3)
    {   
        printf("Usage: target message message_size\n");
        exit(0);
    }   
    *msg = argv[1]; //*msg = message, no explicit size
    return atoi(argv[2]); //return value = message size
}

We will therefore attach stubs to it in order to generate the correct sources:

  • get_input:bytes_to_buf(ret,ret<gdb<*((char**)@2)>>,ret<strlen<gdb<*((char**)@2)>>>,message,controlled) for the message
  • get_input:ret_reg(controlled) for the message size
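To see why both expressions of the first stub are wrapped in ret<...>, it can help to mimic by hand what the stub is asked to compute. The following C sketch is only an illustration, not Colorstreams code: get_input_like, source_range and message_source_at_return are hypothetical names, and it assumes that @2 designates the third argument (msg), counting from zero. Since *msg is only assigned inside get_input, the buffer address and its strlen are meaningful only once the function is about to return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record of the memory covered by a source. */
struct source_range {
    const char *base; /* first byte of the source */
    size_t size;      /* number of bytes covered */
};

/* Same logic as get_input in target2, minus the usage check. */
uint64_t get_input_like(char **argv, char **msg)
{
    *msg = argv[1];                 /* *msg = message, no explicit size */
    return (uint64_t)atoi(argv[2]); /* return value = message size */
}

/* Mirrors the two ret<...> expressions of the bytes_to_buf stub.
 * Must be called after get_input_like, i.e. once *msg has been set. */
struct source_range message_source_at_return(char **msg)
{
    struct source_range r;
    assert(msg != NULL && *msg != NULL);
    r.base = *msg;         /* gdb<*((char**)@2)> */
    r.size = strlen(*msg); /* strlen<gdb<*((char**)@2)>> */
    return r;
}
```

With the tutorial's arguments ("(a" and 0), this gives a 2-byte range starting at argv[1], which is consistent with the [2] size of the message source in the output below.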

Here is the full command line:

colorstreams -main main -p autosource -autosource-source-stubs "get_input:bytes_to_buf(ret,ret<gdb<*((char**)@2)>>,ret<strlen<gdb<*((char**)@2)>>>,message,controlled);get_input:ret_reg(controlled)" -args "\"(a\" 0" ./target2.run

Yielding the following output:

[COLORSTREAMS] Chosen policies: autosource
[Auto-Source <0> → SOURCE] get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg)
[Auto-Source <0> → SOURCE] message <1> <MEM<0x7fff7f7d6571>[2]>: controlled (auto bytes_to_buf)
[ERROR <1>] target crashed
[Auto-Source <0> → STATS]
Auto-Source <0> stats:
├─Processed function calls: 73
├─Processed instructions: 3585
├─Unique processed instructions: 1622
├─Sources: 0
├─Sinks: 0
├─Positive sinks: 0
└─Analysis runtime: 0.001s
[STATS]
General stats:
├─Warnings: 1
└─Total runtime: 7.696s
[DONE] 

The program crashes on the large memcpy, since target2 has no pinstrio_abort call before it.

Generating sinks with oobautosink

The oobautosink policy generates sinks based on out-of-bounds memory operations detected by the oobcheck policy. In short, oobcheck tracks references to known objects and pointer aliases, then compares the range of each memory read and write against the bounds of the object from which its base address was derived. It can also detect use-after-free bugs.
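The range comparison can be pictured with a small C sketch. This is not oobcheck's implementation: object_bounds and access_in_bounds are hypothetical names, and the sketch only covers the final check, assuming the object an address was derived from is already known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a tracked object. */
struct object_bounds {
    uintptr_t base; /* address of the first byte */
    size_t size;    /* size of the object in bytes */
};

/* Returns true if the access [addr, addr + len) lies entirely inside obj.
 * The check is written with subtractions so that a huge len cannot make
 * addr + len wrap around. */
bool access_in_bounds(const struct object_bounds *obj, uintptr_t addr, size_t len)
{
    assert(obj != NULL);
    if (addr < obj->base)
        return false;                  /* starts before the object */
    uintptr_t offset = addr - obj->base;
    if (offset > obj->size)
        return false;                  /* starts past the end */
    return len <= obj->size - offset;  /* remaining room is sufficient */
}
```

For a 256-byte object, a 256-byte access at its base is accepted, whereas a 257-byte access at the same address is reported as out-of-bounds.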

In target2, vulnerabilities (a) and (b) from the quickstart tutorial both result in buffer overflow writes (and reads), which can be detected by oobcheck. oobautosink will then generate sinks for the base address, size and written data. In this particular case, the OOBs will be detected when memcpy is called, by oobcheck's default memcpy stub.

Although we invoke oobautosink, oobcheck can still be configured through its own command-line options. The following are particularly useful:

  • -oobcheck-dont-check: do not check within the specified functions
  • -oobcheck-ak: terminate analysis automatically if a memory mapping violation is detected, to avoid a crash
  • -oobcheck-ignore-small-reads: ignore reads smaller than the specified size, which can be useful to eliminate spurious detections in vectorized library functions
  • -oobcheck-dont-check-locals: do not check references to local variables, which can significantly improve performance by reducing calls to GDB
  • -oobcheck-check-locals-only-in-func: only check local variable bounds for locals from specified functions
  • -oobcheck-check-reads-only-in-func: only check OOB reads in the specified functions

Note: checking local variable bounds induces many additional GDB queries, which can be very slow. In particular, communication between GDB and PIN is abnormally slow on some systems. This is why -oobcheck-dont-check-locals and -oobcheck-check-locals-only-in-func are important.

To generate sinks for vulnerability (a), run the following:

colorstreams -main main -p oobautosink -oobcheck-dont-check "_IO_printf;_IO_puts" -oobcheck-ak -args "\"(a\" 0" ./target2.run

This gives the following output:

[COLORSTREAMS] Chosen policies: oobautosink
[OOB Auto-Sink <0> → OOB Checker <1> → INFO] stopping on memory violation
[OOB Auto-Sink <0> → OOB Checker <1> → RESULT]
Detection 0: 
├─target: MEM<0x7ffd43b9658e>[4611686018427387903]
├─r/w: read
├─valid: False
├─reason: 
│ ├─type: memory mapping violation
│ └─description: multiple mappings overlap
└─location: 
  ├─??? : memcpy@plt : ???
  ├─/home/guilhem/dev/colorstreams/doc/tutorial/target2.c : get_msg : 37
  └─/home/guilhem/dev/colorstreams/doc/tutorial/target2.c : main : 60
[OOB Auto-Sink <0> → OOB Checker <1> → INFO] stopping on memory violation
[OOB Auto-Sink <0> → OOB Checker <1> → RESULT]
Detection 1: 
├─target: MEM<0x7ffd43b94460>[4611686018427387903]
├─r/w: write
├─valid: False
├─reason: 
│ ├─type: out-of-bounds
│ └─object: 
│   ├─kind: stack variable <buf>
│   ├─object: MEM<0x7ffd43b94460>[256]
│   └─id: 0x2
└─location: 
  ├─??? : memcpy@plt : ???
  ├─/home/guilhem/dev/colorstreams/doc/tutorial/target2.c : get_msg : 37
  └─/home/guilhem/dev/colorstreams/doc/tutorial/target2.c : main : 60
[OOB Auto-Sink <0> → SINK] Detection_1_OOB_write_base <0> <REG<rdi>[8]> (auto OOB base)
[OOB Auto-Sink <0> → SINK] Detection_1_OOB_write_size <1> <REG<rdx>[8]> (auto OOB size)
[OOB Auto-Sink <0> → SINK] Detection_1_OOB_write_wdata <2> <MEM<0x7ffd43b9658e>[4611686018427387903]> (auto OOB written data)
[OOB Auto-Sink <0> → SINK] Detection_0_OOB_read_base <3> <REG<rsi>[8]> (auto OOB base)
[OOB Auto-Sink <0> → SINK] Detection_0_OOB_read_size <4> <REG<rdx>[8]> (auto OOB size)
[OOB Auto-Sink <0> → STATS]
OOB Auto-Sink <0> stats:
├─Processed function calls: 72
├─Processed instructions: 3541
├─Unique processed instructions: 1585
├─Sources: 0
├─Sinks: 0
├─Positive sinks: 0
└─Analysis runtime: 4.003s
[STATS]
General stats:
├─Warnings: 0
└─Total runtime: 11.280s
[DONE] 

Note how large the Detection_1_OOB_write_wdata sink, which corresponds to the written data, is. In practice, it is not feasible to analyze this amount of data. Therefore, oobautosink should be configured to limit the size of the analyzed written data with the -oobautosink-data-lim option. In addition, analyzing individual data bytes can be preferable, which can be achieved with the -oobautosink-split-data option.
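As a rough mental model of these two options, consider the C sketch below. It is not taken from Colorstreams: data_slice, limit_data and split_bytes are hypothetical names, and the sketch assumes that the limit keeps the first bytes of the written data and that splitting yields one single-byte sink per remaining byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of a written-data sink: a byte range inside the write. */
struct data_slice {
    size_t offset; /* offset of the first byte within the written data */
    size_t len;    /* number of bytes covered */
};

/* Assumed effect of a data limit: keep at most lim leading bytes. */
struct data_slice limit_data(size_t total, size_t lim)
{
    struct data_slice s;
    s.offset = 0;
    s.len = total < lim ? total : lim;
    return s;
}

/* Assumed effect of splitting: one single-byte slice per byte of s.
 * Writes at most cap entries to out and returns how many were written. */
size_t split_bytes(struct data_slice s, struct data_slice *out, size_t cap)
{
    assert(out != NULL || cap == 0);
    size_t n = s.len < cap ? s.len : cap;
    for (size_t i = 0; i < n; i++) {
        out[i].offset = s.offset + i;
        out[i].len = 1;
    }
    return n;
}
```

Under these assumptions, a limit of 8 reduces an arbitrarily large write to the bytes at offsets 0 to 7, and splitting that slice gives eight one-byte slices.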

Combining policies

So far, we have seen how to automatically generate sources and sinks. However, this is not very useful if they cannot be redirected to other analyses. To that end, the compose policy lets you define producer policies, which generate sources and sinks, with the -compose-producers option, and consumer policies, which use them, with -compose-consumers.

We can now combine autosource, oobautosink and any other policy, bytedep for example:

colorstreams -main main -p compose -compose-producers "autosource;oobautosink" -autosource-source-stubs "get_input:bytes_to_buf(ret,ret<gdb<*((char**)@2)>>,ret<strlen<gdb<*((char**)@2)>>>,message,controlled);get_input:ret_reg(controlled)" -oobautosink-data-lim 8 -oobcheck-dont-check "_IO_printf;_IO_puts" -oobcheck-ak -compose-consumers bytedep -args "\"(a\" 0" ./target2.run

Yielding the following output:

[COLORSTREAMS] Chosen policies: compose
[Compose <4> → Auto-Source <0> → SOURCE] get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg)
[Compose <4> → Auto-Source <0> → SOURCE] message <1> <MEM<0x7fff67932566>[2]>: controlled (auto bytes_to_buf)
[Compose <4> → Byte Dependency Taint Policy <3> → SOURCE] get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg)
[Compose <4> → Byte Dependency Taint Policy <3> → SOURCE] message <1> <MEM<0x7fff67932566>[2]>: controlled (auto bytes_to_buf)
[Compose <4> → OOB Auto-Sink <1> → OOB Checker <2> → INFO] stopping on memory violation
[Compose <4> → OOB Auto-Sink <1> → OOB Checker <2> → RESULT]
Detection 0: 
├─target: MEM<0x7fff6793258e>[4611686018427387903]
├─r/w: read
├─valid: False
├─reason: 
│ ├─type: memory mapping violation
│ └─description: multiple mappings overlap
└─location: 
  ├─??? : memcpy@plt : ???
  ├─/home/guilhem/dev/colorstreams/doc/tutorial/target2.c : get_msg : 37
  └─/home/guilhem/dev/colorstreams/doc/tutorial/target2.c : main : 60
[Compose <4> → OOB Auto-Sink <1> → OOB Checker <2> → INFO] stopping on memory violation
[Compose <4> → OOB Auto-Sink <1> → OOB Checker <2> → RESULT]
Detection 1: 
├─target: MEM<0x7fff67930ad0>[4611686018427387903]
├─r/w: write
├─valid: False
├─reason: 
│ ├─type: out-of-bounds
│ └─object: 
│   ├─kind: stack variable <buf>
│   ├─object: MEM<0x7fff67930ad0>[256]
│   └─id: 0x2
└─location: 
  ├─??? : memcpy@plt : ???
  ├─/home/guilhem/dev/colorstreams/doc/tutorial/target2.c : get_msg : 37
  └─/home/guilhem/dev/colorstreams/doc/tutorial/target2.c : main : 60
[Compose <4> → OOB Auto-Sink <1> → SINK] Detection_1_OOB_write_base <0> <REG<rdi>[8]> (auto OOB base)
[Compose <4> → OOB Auto-Sink <1> → SINK] Detection_1_OOB_write_size <1> <REG<rdx>[8]> (auto OOB size)
[Compose <4> → OOB Auto-Sink <1> → SINK] Detection_1_OOB_write_wdata <2> <(0->7 MEM<0x7fff6793258e>[4611686018427387903])> (auto OOB written data)
[Compose <4> → OOB Auto-Sink <1> → SINK] Detection_0_OOB_read_base <3> <REG<rsi>[8]> (auto OOB base)
[Compose <4> → OOB Auto-Sink <1> → SINK] Detection_0_OOB_read_size <4> <REG<rdx>[8]> (auto OOB size)
[Compose <4> → Byte Dependency Taint Policy <3> → SINK] Detection_1_OOB_write_base <0> <REG<rdi>[8]> (auto OOB base)
[Compose <4> → Byte Dependency Taint Policy <3> → SINK] Detection_1_OOB_write_size <1> <REG<rdx>[8]> (auto OOB size)
[Compose <4> → Byte Dependency Taint Policy <3> → RESULT]
Detection_1_OOB_write_size <1>: 
├─Sink: REG<rdx>[8]
└─Taint: 
  ├─REG<rdx>[8]: <D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7]>
  ├─Tainted registers: 2
  └─Tainted memory bytes: 50
[Compose <4> → Byte Dependency Taint Policy <3> → SINK] Detection_1_OOB_write_wdata <2> <(0->7 MEM<0x7fff6793258e>[4611686018427387903])> (auto OOB written data)
[Compose <4> → Byte Dependency Taint Policy <3> → SINK] Detection_0_OOB_read_base <3> <REG<rsi>[8]> (auto OOB base)
[Compose <4> → Byte Dependency Taint Policy <3> → SINK] Detection_0_OOB_read_size <4> <REG<rdx>[8]> (auto OOB size)
[Compose <4> → Byte Dependency Taint Policy <3> → RESULT]
Detection_0_OOB_read_size <4>: 
├─Sink: REG<rdx>[8]
└─Taint: 
  ├─REG<rdx>[8]: <D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7], D(get_input_ret <0> <REG<rax>[8]>: controlled (auto ret_reg))[0 - 7]>
  ├─Tainted registers: 2
  └─Tainted memory bytes: 50
[Compose <4> → STATS]
Compose <4> stats:
├─Processed function calls: 72
├─Processed instructions: 3541
├─Unique processed instructions: 1585
├─Sources: 0
├─Sinks: 0
├─Positive sinks: 0
└─Analysis runtime: 4.830s
[STATS]
General stats:
├─Warnings: 0
└─Total runtime: 12.036s
[DONE] 

Policy and option tagging

Now that we can combine different analysis policies, one issue arises: how do we run several instances of the same analysis policy with different options? Colorstreams' solution to this problem is tagging.

Tags are keywords that can be defined with the command line option -tag. They become active for any subsequent option, which can be associated with a tag by adding a tag: prefix. For example, -oobautosink-data-lim becomes -tag:oobautosink-data-lim.

Tagged options only affect policies with the same tag, and take priority over non-tagged options. Policies can be tagged by adding a :tag suffix to their name. For example, oobautosink becomes oobautosink:tag.

Analyses internally created by others (such as oobcheck in oobautosink) inherit their parent's tag.
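The priority rule can be illustrated with a short C sketch of option lookup. It is a conceptual model only, not how Colorstreams stores its options: the option struct, resolve_option and the tag name small used in the usage note are hypothetical, and letting the last matching entry win among duplicates is an arbitrary choice made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command-line option entry. */
struct option {
    const char *tag;   /* NULL for a non-tagged option */
    const char *name;  /* e.g. "oobautosink-data-lim" */
    const char *value;
};

/* Returns the value a policy carrying policy_tag (NULL if untagged) sees for
 * the option called name: an option with the same tag wins, otherwise the
 * non-tagged one is used. Returns NULL if the option was not given at all. */
const char *resolve_option(const struct option *opts, size_t n,
                           const char *policy_tag, const char *name)
{
    const char *tagged = NULL;
    const char *untagged = NULL;
    assert(name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(opts[i].name, name) != 0)
            continue;                      /* different option */
        if (opts[i].tag == NULL)
            untagged = opts[i].value;      /* applies to every policy */
        else if (policy_tag != NULL && strcmp(opts[i].tag, policy_tag) == 0)
            tagged = opts[i].value;        /* applies to this tag only */
    }
    return tagged != NULL ? tagged : untagged;
}
```

In this model, if the option list holds a non-tagged oobautosink-data-lim of 8 and one tagged small with value 1, a policy tagged small resolves it to 1, while any other policy resolves it to 8. A child analysis would simply be looked up with its parent's tag.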

Other automation policies

  • autosink generates sinks based on specified function stubs, similarly to autosource.
  • cfhautosink generates sinks for control-flow hijacking primitives, such as the corruption of return addresses or function pointers. It is based on taint via the bytedep policy and thus requires sources for controlled inputs.