gimli

gimli is a blazing fast library for consuming the DWARF debugging format.

  • Zero copy: everything is just a reference to the original input buffer. No copies of the input data get made.

  • Lazy: you can iterate compilation units without parsing their contents. Parse only as many debugging information entry (DIE) trees as you iterate over. gimli also uses DW_AT_sibling references to avoid parsing a DIE's children to find its next sibling, when possible.

  • Cross-platform: gimli makes no assumptions about what kind of object file you're working with. The flipside to that is that it's up to you to provide an ELF loader on Linux or Mach-O loader on macOS.

    • Unsure which object file parser to use? Try the cross-platform object crate. See the examples/ directory for usage with gimli.
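The zero-copy idea above can be sketched without gimli itself. The following is a minimal illustration, not gimli's actual reader: `SliceReader` and its methods are hypothetical names, and the reader only ever hands out sub-slices of the buffer it borrows.

```rust
/// A hypothetical zero-copy reader: it borrows the input buffer and never copies it.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SliceReader<'input> {
    bytes: &'input [u8],
}

impl<'input> SliceReader<'input> {
    pub fn new(bytes: &'input [u8]) -> Self {
        SliceReader { bytes }
    }

    /// Number of bytes that have not been consumed yet.
    pub fn len(&self) -> usize {
        self.bytes.len()
    }

    /// Read an unsigned LEB128 value and advance past it.
    /// Returns `None` on truncated input or (simplified check) oversized values.
    pub fn read_uleb128(&mut self) -> Option<u64> {
        let mut result = 0u64;
        let mut shift = 0u32;
        loop {
            let (&byte, rest) = self.bytes.split_first()?;
            self.bytes = rest;
            if shift >= 64 {
                return None;
            }
            result |= u64::from(byte & 0x7f) << shift;
            if byte & 0x80 == 0 {
                return Some(result);
            }
            shift += 7;
        }
    }

    /// Split off the next `len` bytes as a sub-reader. Only the slice
    /// reference is narrowed; the bytes themselves are not copied.
    pub fn split(&mut self, len: usize) -> Option<SliceReader<'input>> {
        if len > self.bytes.len() {
            return None;
        }
        let (head, tail) = self.bytes.split_at(len);
        self.bytes = tail;
        Some(SliceReader { bytes: head })
    }

    /// The remaining bytes, still borrowed from the original input.
    pub fn as_slice(&self) -> &'input [u8] {
        self.bytes
    }
}
```

Because every sub-reader carries the `'input` lifetime, the borrow checker guarantees the original buffer outlives anything parsed from it.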

Install

Add this to your Cargo.toml:

The minimum supported Rust version is 1.42.0.

Documentation

  • Documentation on docs.rs

  • Example programs:

    • A simple .debug_info parser

    • A simple .debug_line parser

    • A dwarfdump clone

    • An addr2line clone

    • ddbug, a utility giving insight into code generation by making debugging information readable.

    • dwprod, a tiny utility to list the compilers used to create each compilation unit within a shared library or executable (via DW_AT_producer).

    • dwarf-validate, a program to validate the integrity of some DWARF and its references between sections and compilation units.

License

Licensed under either of

  • Apache License, Version 2.0 (LICENSE-APACHE or https://www.apache.org/licenses/LICENSE-2.0)
  • MIT license (LICENSE-MIT or https://opensource.org/licenses/MIT)

at your option.

Contribution

See CONTRIBUTING.md for hacking.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Issues

Collection of the latest Issues

mstange

EhHdrTable::lookup takes an address and finds the entry in the table whose address is closest to the looked up address. Then it drops that entry's address on the floor and only returns the "FDE pointer" part of the entry.

I think it would be great if both parts of the entry were returned: The entry address and the entry FDE pointer.

I am interested in the address as an approximation for "the closest function start address". I could of course get that address by parsing the eh_frame section and looking up the FDE, but this seems like a waste if what I'm looking for is already in eh_frame_hdr.
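To make the request concrete, here is a rough sketch of a lookup that returns both halves of the entry. It is not gimli's implementation: `TableEntry` and `lookup_entry` are hypothetical names, the table is assumed to be already decoded and sorted, and "closest" is taken to mean the last entry at or below the looked-up address.

```rust
/// One row of a hypothetical, already-decoded binary search table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TableEntry {
    /// Start address covered by the entry.
    pub initial_location: u64,
    /// Pointer to the FDE that describes that address.
    pub fde_pointer: u64,
}

/// Return the last entry whose `initial_location` is <= `address`,
/// keeping both the entry address and the FDE pointer.
/// `table` must be sorted by `initial_location`.
pub fn lookup_entry(table: &[TableEntry], address: u64) -> Option<TableEntry> {
    // Index of the first entry that starts after `address`.
    let idx = table.partition_point(|e| e.initial_location <= address);
    // The entry before it, if any, is the closest one at or below `address`.
    idx.checked_sub(1).map(|i| table[i])
}
```

A caller that only wants the approximate function start can read `initial_location` from the result and never touch eh_frame.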

akirilov-arm

Currently (as of commit 6edacd2b59a1c586d971d6eaa7929fb26e2aa9c7) when writing Common Information Entries there is no way to set an augmentation string directly; there is only a way to use a limited set of values indirectly. Support for setting CIE augmentation strings is important when using the Memory Tagging and/or Pointer Authentication extensions to the Arm architecture.

philipc

Even though the DWARF 5 standard says "The value 0 indicates that no source file has been specified.", discussion on the DWARF mailing list indicates that this was an oversight. Additionally, LLVM does emit DW_AT_decl_file = 0 (and #606 changes dwarfdump to handle this). This change would mean we also don't need to duplicate the compilation file in the line table.

See 1f7de25 for the original implementation of the file index 0 handling.

philipc

enhancement

Errors that have been seen in the wild, and thus likely to be useful to check:

  • DW_AT_abstract_origin points to a DIE with the wrong tag (e.g. DW_TAG_inlined_subroutine should point to DW_TAG_subprogram)
  • DW_AT_call_file has invalid file index

RReverser

When modifying source locations in the binary (like in Wasm use-cases), it's desirable to rewrite all of the debugging information to match the new addresses.

Dwarf::from accepts convert_address, which seems like a good place for doing such rewriting, but currently when it reaches the LineProgram conversion, it only executes the callback for the base address (SetAddress) instruction and not for individual offsets after that, making granular per-instruction address updates impossible.

One way to work around this issue would be to get mutable access to the instructions property of gimli::write::LineProgram, walk over the instructions, and update the offsets manually, but currently that property isn't exposed either.

What would be the best way to allow such granular rewrites? Should we call convert_address on each individual instruction and recalculate offsets, or would it be better to add a separate callback to the API, or expose the instructions mutable iterator?
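As a way to picture the first option (convert every row address and recalculate offsets), here is a small sketch over a toy instruction set. `LineOp` and `rewrite_rows` are hypothetical and are not part of gimli; real line programs have many more opcodes and sequence rules that this ignores.

```rust
/// A simplified, hypothetical subset of line number program instructions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LineOp {
    /// Set the address register to an absolute address.
    SetAddress(u64),
    /// Advance the address register by a delta.
    AdvancePc(u64),
    /// Emit a row at the current address.
    Copy,
}

/// Rewrite a program so that the address of every row goes through `convert`,
/// then re-encode the rows as a base address plus recalculated deltas.
pub fn rewrite_rows(ops: &[LineOp], convert: impl Fn(u64) -> u64) -> Vec<LineOp> {
    let mut out = Vec::new();
    let mut old_address = 0u64;
    let mut prev_new: Option<u64> = None;
    for op in ops {
        match *op {
            LineOp::SetAddress(a) => old_address = a,
            LineOp::AdvancePc(d) => old_address = old_address.wrapping_add(d),
            LineOp::Copy => {
                let new = convert(old_address);
                match prev_new {
                    // Address moved forward: emit the recalculated delta.
                    Some(prev) if new > prev => out.push(LineOp::AdvancePc(new - prev)),
                    // Same address as the previous row: nothing to emit.
                    Some(prev) if new == prev => {}
                    // First row, or the address moved backwards: emit an absolute address.
                    _ => out.push(LineOp::SetAddress(new)),
                }
                out.push(LineOp::Copy);
                prev_new = Some(new);
            }
        }
    }
    out
}
```

Converting only the `SetAddress` operand, as described above, would leave the `AdvancePc` deltas untouched, which is why rows after an inserted or removed instruction end up at stale addresses.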

philipc

When parsing units in various sections, we return an error if the version is unknown, which terminates iteration. This means we operate poorly with unknown DWARF versions (e.g. this occurs for .debug_aranges in #559). I think instead we should allow users to write a loop which only terminates for an error that prevents further iteration.

This is how you currently iterate over units, and I don't see how to modify this to allow skipping some or all errors (which is expected, because that was the point of fallible iterators):

Instead, I think we need to allow these:

This is using a normal Iterator instead of FallibleIterator. Note that this change is only required for iterators that can continue iterating after an error, so it would not be required for DebuggingInformationEntry::attrs(), which is where we first started using fallible iterators. So this would be a reversal of the decision in #44.
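The difference between the two iteration styles can be illustrated with a toy model. This is only a sketch of the trade-off, not the code from the issue: `Units`, `ParseError`, `collect_strict`, and `collect_lenient` are hypothetical, and a "unit" is reduced to its version number.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    /// The unit header was readable, but its version is not supported.
    UnknownVersion(u16),
    /// The input ended early, so nothing further can be parsed.
    UnexpectedEof,
}

/// Hypothetical unit iterator; each "unit" is modelled as just a version number.
pub struct Units<'a> {
    versions: &'a [u16],
}

impl<'a> Units<'a> {
    pub fn new(versions: &'a [u16]) -> Self {
        Units { versions }
    }
}

impl<'a> Iterator for Units<'a> {
    type Item = Result<u16, ParseError>;

    fn next(&mut self) -> Option<Self::Item> {
        let (&version, rest) = self.versions.split_first()?;
        // The iterator advances even when it reports an error,
        // so the caller can decide whether to keep going.
        self.versions = rest;
        Some(if (2..=5).contains(&version) {
            Ok(version)
        } else {
            Err(ParseError::UnknownVersion(version))
        })
    }
}

/// Fallible-iterator style: the first error of any kind ends iteration.
pub fn collect_strict(units: Units<'_>) -> Result<Vec<u16>, ParseError> {
    units.collect()
}

/// Plain `Iterator` style: the caller skips errors it can recover from
/// and stops only for errors that prevent further iteration.
pub fn collect_lenient(units: Units<'_>) -> Result<Vec<u16>, ParseError> {
    let mut out = Vec::new();
    for unit in units {
        match unit {
            Ok(version) => out.push(version),
            Err(ParseError::UnknownVersion(_)) => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(out)
}
```

With an `Iterator` of `Result` items, the policy for unknown versions lives in the caller's `match`, which is the flexibility this proposal is after.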

Alternatively, we could silently skip units with unknown versions. I don't think this is ideal, but maybe this is ok if this is the behaviour that most users want.

This would be a large breaking change. Thoughts @fitzgen ?

vaibspider

Hi, I am trying to build a DWARF expression using gimli that extracts a range of bits from an xmm register, e.g. bits 32-63 of xmm0. But I found that the DWARF 4 standard, section 2.5.1, says:

Each general operation represents a postfix operation on a simple stack machine. Each element of the stack is the size of an address on the target machine

So it seems that on a 32-bit machine, the value of a 128-bit register such as xmm0 would be truncated to 32 bits. Could you please confirm this and let me know if I'm missing something? Thanks!
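The concern can be shown with plain arithmetic. The sketch below models only the reading of the spec quoted above (stack elements are address-sized); it does not confirm how any particular consumer behaves, and `to_stack_element` and `extract_bits` are hypothetical helpers.

```rust
/// Truncate a value to the size of a generic stack element, assumed here to be
/// the size of a target address (`address_size` bytes, at most 8 in this sketch).
pub fn to_stack_element(value: u128, address_size: u8) -> u64 {
    assert!((1..=8).contains(&address_size));
    let bits = u32::from(address_size) * 8;
    let mask = if bits == 64 { u64::MAX } else { (1u64 << bits) - 1 };
    (value as u64) & mask
}

/// Extract bits `lo..=hi` from a full-width register value,
/// which is what the expression is meant to compute.
pub fn extract_bits(value: u128, lo: u32, hi: u32) -> u128 {
    assert!(lo <= hi && hi < 128);
    let width = hi - lo + 1;
    let mask = if width == 128 { u128::MAX } else { (1u128 << width) - 1 };
    (value >> lo) & mask
}
```

With a 4-byte address size, a 128-bit value pushed through `to_stack_element` keeps only bits 0-31, so shifting it right by 32 afterwards yields zero rather than bits 32-63.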

philipc

In #514/#515, parsing failed because some attributes used DW_FORM_GNU_strp_alt, but no supplementary object file sections were loaded. The error value (UnexpectedEof(ReaderOffsetId(4497503920))) made it hard to determine the cause. We should return a better error in this case.

vaibspider

Hi, I am trying to update a location list present in a relocatable object file. I noticed that gimli::write::LocationListTable only has an add() method. I am able to add a new location list with this interface. But, I don't see an interface to get a mutable location list for an offset from this LocationListTable, which we can update.

I also searched for a way to get a readable instance of a location list and then convert it into a writable form, but couldn't find such an interface. (I can get a LocListIter with gimli::read::Dwarf::attr_locations(), but only for reading.)

Am I missing something? Thanks!

philipc

A couple of issues were encountered in #448 and https://github.com/bjorn3/rustc_codegen_cranelift/pull/978:

  • subprograms need a DW_AT_name attribute
  • DW_AT_high_pc can only be a length for DWARF version >= 4
  • base addresses in location lists are new in DWARF version 3

And there'll be a whole bunch of other things that the standard requires or that are added in newer versions which we don't check for at all.

Some ideas for how to help with this:

  • add these sorts of checks to the dwarf-validate example
  • add a validate method to UnitTable or Unit

Ideally the API would disallow these sorts of errors, but it is too low level for this. Maybe we should provide a higher level API too.
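A validate method along these lines might look roughly like the sketch below. It covers only the three checks listed above, and every type in it (`Subprogram`, `HighPc`, `Problem`) is a hypothetical stand-in rather than a gimli type.

```rust
/// How DW_AT_high_pc is encoded in this toy model.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HighPc {
    Address(u64),
    Length(u64),
}

/// A hypothetical, heavily simplified view of a DW_TAG_subprogram entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Subprogram {
    pub name: Option<String>,
    pub high_pc: Option<HighPc>,
    /// Whether its location lists contain base address entries.
    pub location_list_uses_base_address: bool,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Problem {
    MissingName,
    HighPcLengthNeedsVersion4,
    BaseAddressNeedsVersion3,
}

/// Collect every problem found instead of stopping at the first one.
pub fn validate(version: u16, subprogram: &Subprogram) -> Vec<Problem> {
    let mut problems = Vec::new();
    if subprogram.name.is_none() {
        problems.push(Problem::MissingName);
    }
    if version < 4 && matches!(subprogram.high_pc, Some(HighPc::Length(_))) {
        problems.push(Problem::HighPcLengthNeedsVersion4);
    }
    if version < 3 && subprogram.location_list_uses_base_address {
        problems.push(Problem::BaseAddressNeedsVersion3);
    }
    problems
}
```

Returning a list of problems rather than a single error suits a validator example, where seeing all violations at once is more useful than failing fast.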

philipc

My strategy for handling relocatable addresses so far has been to override Reader::read_address/read_offset to store the relocation in a map and return an index into that map instead of the real address, and later convert it back when consuming the address. This doesn't work for parse_encoded_pointer because it doesn't use read_address/read_offset for relative pointers, and even if it did my strategy wouldn't work because parse_encoded_pointer needs to add the address to a base address.

I think the only fix for this is something like #409.
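The strategy described above, and the reason it breaks for base-relative pointers, can be sketched in isolation. `RelocationMap` and `Relocation` are hypothetical names; the real version would sit behind an overridden Reader::read_address/read_offset.

```rust
/// A relocation recorded for one address-sized field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Relocation {
    pub symbol: usize,
    pub addend: i64,
}

/// Hypothetical side table: readers return an index into it instead of a real address.
#[derive(Debug, Default)]
pub struct RelocationMap {
    entries: Vec<Relocation>,
}

impl RelocationMap {
    /// Called where the reader would return the address:
    /// store the relocation and hand back its index as a placeholder.
    pub fn intern(&mut self, relocation: Relocation) -> u64 {
        self.entries.push(relocation);
        (self.entries.len() - 1) as u64
    }

    /// Called when the address is consumed: turn the placeholder back into a relocation.
    pub fn resolve(&self, placeholder: u64) -> Option<Relocation> {
        self.entries.get(placeholder as usize).copied()
    }
}
```

The placeholder only round-trips if nothing does arithmetic on it. As soon as a base address is added to it, as happens for relative encoded pointers, the result no longer identifies the stored relocation.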

vaibspider

I am using Dwarf::from() to convert a readable Dwarf instance to a writable one, with a convert_address() function as follows:

And I'm getting an InvalidAttributeValue error on the Dwarf::from line. As mentioned in Dwarf::from()'s documentation,

convert_address is a function to convert read addresses into the Address type. For non-relocatable addresses, this function may simply return Address::Constant(address). For relocatable addresses, it is the caller's responsibility to determine the symbol and addend corresponding to the address and return Address::Symbol { symbol, addend }.

I'm simply returning Address::Constant(addr) from the convert_address function. Could someone please clarify what changes are needed for relocatable addresses in convert_address? (Is convert_address the cause of this error?)

Thanks
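For reference, the two cases named in the quoted documentation could be handled roughly as follows. This is a sketch under assumptions, not gimli's API: `Address` is a local enum that mirrors the shape described in the quote, the relocation table is a hypothetical map keyed by the address value that was read (a simplification), and the callback is assumed to return an `Option`.

```rust
use std::collections::HashMap;

/// Local stand-in with the shape the documentation describes for the write API's address type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Address {
    Constant(u64),
    Symbol { symbol: usize, addend: i64 },
}

/// Hypothetical conversion: addresses with a known relocation become
/// `Address::Symbol`, everything else stays a plain constant.
pub fn convert_address(
    relocations: &HashMap<u64, (usize, i64)>,
    address: u64,
) -> Option<Address> {
    match relocations.get(&address) {
        Some(&(symbol, addend)) => Some(Address::Symbol { symbol, addend }),
        None => Some(Address::Constant(address)),
    }
}
```

A closure such as `|addr| convert_address(&relocations, addr)` is the kind of callback this models; how the symbol and addend are found for a real object file is left to the caller, as the documentation says.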

philipc

The current handling of DW_FORM_ref_addr assumes that you are writing all of the compilation units at once, so that it can calculate the section offset itself, but that is atypical behaviour. Instead, the producer will normally need to generate a symbol and relocation so that the reference can be resolved during static linking.

Fixing this will need a means to assign a symbol to the reference, and a means to assign a symbol to an entry that may be referred to.

This will also apply to DW_OP_call_ref in expressions.

luser

I was looking up info on GDB's .gdb_index section today and found that it has been superseded by the DWARF 5 .debug_names section. gdb ships a helper tool (gdb-add-index) that can generate and add a .gdb_index section to a binary and in recent versions (I'm using gdb 8.3 on Ubuntu) it can instead generate .debug_names:

The .debug_names section format is specified in the DWARF 5 spec section 6.1.1, "Lookup by Name".

philipc

enhancement

Now that the Error trait has deprecated description (see #462), we shouldn't bother with read::Error::description at all, and it should be moved into the impl Display.

Since this also means we no longer need to return &str, we should improve the displayed error messages to include any additional information they contain.
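A minimal sketch of what moving the messages into Display could look like is shown below. The variants and their payloads are hypothetical and chosen only to show extra information being included in the message; they are not gimli's actual error definitions.

```rust
use std::fmt;

/// Hypothetical error type with payloads worth showing to the user.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Error {
    UnexpectedEof { offset: u64 },
    UnknownVersion(u64),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // No `&'static str` is required here, so each message can
        // interpolate the data carried by its variant.
        match *self {
            Error::UnexpectedEof { offset } => {
                write!(f, "unexpected end of input at offset {:#x}", offset)
            }
            Error::UnknownVersion(version) => {
                write!(f, "unknown DWARF version: {}", version)
            }
        }
    }
}

// `description` is deprecated, so the trait impl can stay empty.
impl std::error::Error for Error {}
```

Callers then get the detailed text through `to_string()` or `{}` formatting, with no separate description method to keep in sync.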

mitsuhiko

I'm playing around with various ways at the moment to make it easier to address file contents, and DWARF 5 is already helping here with DW_LNCT_MD5. It lets you address a source file by MD5 hash, which is pretty helpful, but it still requires a separate system to actually resolve the sources.

I came across an LLVM extension recently (DW_LNCT_LLVM_source) which LLVM added to support debugging of generated GPU programs. I feel like this would be also solving my issue quite well. However I'm not sure how gimli thinks about LLVM extensions.

The reason I'm asking is because I wanted to use gimli's write interface for this but currently the logic that emits these DW_LNCT_ attributes is internal to the structs so it requires a patch to gimli to emit new ones at the moment.

jsalzbergedu

I have a 16-bit DOS exe (MZ format) that I'd like to extract line number information from. It seems it may be DWARF, as it has the .debug_info, .debug_abbrev, etc. sections, and it was compiled with Watcom, which uses DWARF. However, I got lost in the examples, and I can't figure out exactly what gimli needs: I know the address where .debug_info starts, and I know the header info, relocation table, and so on. Is that enough? If so, what's a minimal example that I can get started with? Thanks.

PS: here's what the file looks like:

PPS: object is incapable of loading this kind of file

Information - Updated Jun 22, 2022

Stars: 589
Forks: 77
Issues: 53
