Dowser

[Dowser] is a(nother) fast, multi-threaded, recursive file-finding library for Unix/Rust. It differs from Walkdir and kin in a number of ways:

  • It is not limited to one root; any number of file and directory paths can be loaded and traversed en masse;
  • Symlinks and hidden directories are followed like any other, including across devices;
  • Matching file paths are canonicalized, deduped, and collected into a Vec<PathBuf>.

If those things sound nice, this library might be a good fit.

On the other hand, [Dowser] is optimized for just one particular type of searching:

  • File paths can be filtered via [Dowser::filtered] or [Dowser::regex], but directory paths cannot;
  • There are no settings for things like min/max depth, directory filtering, etc.;
  • It only returns file paths. Directories are crawled, but not returned in the set;
  • File uniqueness hashing relies on Unix metadata; this library is not compatible with Windows.

Depending on your needs, those limitations could be bad, in which case something like Walkdir might make more sense.
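To make the Unix-metadata point above concrete, here is a minimal std-only sketch of metadata-based uniqueness. It is not Dowser's implementation: which fields the library actually hashes is not stated here, so the (device ID, inode number) pair is an assumption, and `file_key` and `unique_files` are hypothetical helper names.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt; // Unix-only, hence no Windows support
use std::path::{Path, PathBuf};

/// Identity key for a path: (device ID, inode number).
/// ASSUMPTION: this pair is a stand-in for whatever Dowser really hashes.
/// `fs::metadata` follows symlinks, so a link and its target share a key.
fn file_key(path: &Path) -> io::Result<(u64, u64)> {
    let meta = fs::metadata(path)?;
    Ok((meta.dev(), meta.ino()))
}

/// Keep the first path seen for each key; unreadable paths are skipped.
fn unique_files<I>(paths: I) -> Vec<PathBuf>
where
    I: IntoIterator<Item = PathBuf>,
{
    let mut seen = HashSet::new();
    paths
        .into_iter()
        .filter(|p| file_key(p).map_or(false, |key| seen.insert(key)))
        .collect()
}
```

Because the key comes from the filesystem rather than the path string, hard links and symlinks to the same file collapse into one entry. The `std::os::unix` import is also why an approach like this does not port to Windows.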

Installation

Add dowser to your dependencies in Cargo.toml.
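A minimal sketch of that entry, assuming the 0.5 release series listed under Versions is the one you want (adjust the version to suit):

```toml
[dependencies]
# Version is an assumption based on the latest release listed under Versions.
dowser = "0.5"
```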

Features

  • regexp: Enables the [Dowser::regex] method, which allows for matching file paths (as bytes) against a regular expression.

To use this feature, enable it on the dowser dependency in Cargo.toml.
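As a sketch, the dependency entry with the regexp feature enabled could look like the following. The feature name comes from the list above; the version number is an assumption.

```toml
[dependencies.dowser]
# Version is an assumption; pick whichever release you are targeting.
version = "0.5"
features = [ "regexp" ]
```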

Example

This crate comes with two ways to find files. If you already have the full list of starting path(s) and just want all the files that exist under them, use the dowse method.

If you want to filter files or need to add path(s) to the crawl list multiple times, initialize a [Dowser] object with one of the following three methods:

  • [Dowser::default]: Return all files without prejudice.
  • [Dowser::filtered]: Filter file paths via the provided callback.
  • [Dowser::regex]: Filter file paths via regular expression. (This requires enabling the regexp crate feature.)

From there, add one or more file or directory paths using the [Dowser::with_path] and [Dowser::with_paths] methods.

Finally, collect the results with Vec::<PathBuf>::try_from(). If no files are found, an error is returned; otherwise the matching file paths are collected into a vector.
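The three steps above (pick a filter, add paths, collect with try_from) can be sketched with the standard library alone. This does not use Dowser: `Crawl` is a hypothetical stand-in type, written single-threaded, so the block compiles without any dependencies. It only mirrors the shape of the workflow and says nothing about how the library is implemented.

```rust
use std::collections::HashSet;
use std::convert::TryFrom;
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a finder object; NOT Dowser's real type.
struct Crawl {
    filter: Box<dyn Fn(&Path) -> bool>,
    roots: Vec<PathBuf>,
}

impl Crawl {
    /// Start a crawl that keeps only the files accepted by `cb`.
    fn filtered<F: Fn(&Path) -> bool + 'static>(cb: F) -> Self {
        Crawl { filter: Box::new(cb), roots: Vec::new() }
    }

    /// Queue one more file or directory path.
    fn with_path<P: AsRef<Path>>(mut self, path: P) -> Self {
        self.roots.push(path.as_ref().to_path_buf());
        self
    }
}

impl TryFrom<Crawl> for Vec<PathBuf> {
    type Error = &'static str;

    fn try_from(crawl: Crawl) -> Result<Self, Self::Error> {
        let mut seen = HashSet::new(); // canonical paths already visited
        let mut stack = crawl.roots;
        let mut out = Vec::new();

        while let Some(raw) = stack.pop() {
            // Canonicalize first so duplicates and symlink loops are skipped.
            let path = match fs::canonicalize(&raw) {
                Ok(p) => p,
                Err(_) => continue, // missing or unreadable path
            };
            if !seen.insert(path.clone()) {
                continue;
            }
            if path.is_dir() {
                // Directories are crawled (hidden ones too) but never returned.
                if let Ok(entries) = fs::read_dir(&path) {
                    stack.extend(entries.flatten().map(|e| e.path()));
                }
            } else if path.is_file() && (crawl.filter)(&path) {
                out.push(path);
            }
        }

        if out.is_empty() { Err("no files found") } else { Ok(out) }
    }
}
```

With this stand-in, a call such as `Vec::<PathBuf>::try_from(Crawl::filtered(|p| p.extension().map_or(false, |e| e == "gz")).with_path("/usr/share/man"))` would return either the matching canonical paths or an error when nothing matched, which is the same contract described above.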

License

See also: CREDITS.md

Copyright © 2021 Blobfolio, LLC <[email protected]>

This work is free. You can redistribute it and/or modify it under the terms of the Do What The Fuck You Want To Public License, Version 2.

DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
Version 2, December 2004

Copyright (C) 2004 Sam Hocevar <[email protected]>

Everyone is permitted to copy and distribute verbatim or modified
copies of this license document, and changing it is allowed as long
as the name is changed.

DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. You just DO WHAT THE FUCK YOU WANT TO.
Versions

v0.5.2 - Jun 18, 2022

Misc

  • Update dependencies.

v0.5.1 - May 28, 2022

Fixed

  • Files could be erroneously skipped when crossing filesystem boundaries

Removed

  • Feature parking_lot_mutex (std::sync::Mutex will be faster in Rust 1.62)

v0.5.0 - May 28, 2022

Yanked: This release incorrectly detected device changes. Use 0.5.1+.

This release removes DirConcurrency and related methods.

Parallel directory reads are now automatic and mandatory, but the inner loops — reading/filtering the contents of those directories — are now executed serially (within each parallel thread), greatly reducing the number of concurrently open file handles and subsequent risk of hitting ulimit ceilings.

The file collision (uniqueness) filters have also been greatly improved, further reducing the number of syscalls and overall search times.

v0.4.7 - May 19, 2022

Changed

  • Lock third-party dependency versions
  • Faster parallel iteration
  • Lower DirConcurrency::Sane from threads - 1 to threads / 2

v0.4.6 - Apr 17, 2022

Added

  • Extension::codegen (compile-time helper)
  • Extension::slice_ext

v0.4.5 - Mar 30, 2022

Changed

  • Replace hasher with dactyl::NoHash

v0.4.4 - Mar 28, 2022

Added

  • impl From<&OsStr>
  • impl From<&str>
  • impl From<&String>
  • impl From<String>

Deprecated

  • DirConcurrency::Other (prefer DirConcurrency::Custom)

Changed

  • DirConcurrency::Single now does all processing in serial

v0.4.3 - Mar 26, 2022

Added

  • impl Clone for Dowser
  • Dowser::into_vec
  • Dowser::with_dir_concurrency

v0.4.2 - Mar 08, 2022

Changes:

  • Minor performance improvements.

v0.3.6 - Jan 30, 2022

Changed:

  • Update dependencies;
  • Fix feature-dependent doctests;
  • Make parking_lot dependency optional (but still default);
  • Replace flume with crossbeam-channel;

Deprecated:

  • utility::du

v0.3.5 - Dec 31, 2021

Changes:

  • New Dowser::with_capacity;
  • New Dowser::with_capacity_and_filter;
  • New Dowser::shallow;
  • Use parking_lot and flume for slightly faster processing.

v0.3.4 - Dec 22, 2021

Changes:

  • New Dowser::par_without_paths;
  • Minor performance improvements;
  • Minor doc improvements;

v0.3.3 - Dec 21, 2021

Changes:

  • New Dowser::without_paths;
  • New Dowser::without_path;

v0.3.2 - Dec 15, 2021

Changes:

  • Improved path deduplication;
  • New Dowser::into_vec method;
  • Added From impls for owned PathBuf collections and singular Path/PathBuf types

v0.3.1 - Dec 15, 2021

Changes:

  • Doc improvements;
  • Deprecate dowser::dowse;

v0.3.0 - Oct 21, 2021

Changed

  • Use Rust edition 2021.

v0.2.4 - Jun 01, 2021

This release adds PartialEq<AsRef<Path>> for Extension, useful shorthand when needing to verify a single path has a single extension.

v0.2.3 - May 07, 2021

Changes:

  • This release includes updated dependency versions.

v0.2.2 - Apr 21, 2021

Changes:

  • New Extension match helper;

v0.2.1 - Apr 17, 2021

Changes

  • Eliminate unsafe {} block;
  • Minor doc/CI changes;

v0.2.0 - Mar 07, 2021

Changes:

  • Replace Dowser::with_filter with Dowser::filtered
  • Replace Dowser::with_regex with Dowser::regex
  • Replace Dowser::build with Vec::<PathBuf>::try_from
  • Improve documentation

v0.1.1 - Feb 23, 2021

Changes:

  • Streamline file hashing;
  • Eliminate unsafe {} block;

v0.1.0 - Feb 22, 2021

Initial release!

Information - Updated Jun 22, 2022
