ProductPromotion
Logo

Rust

made by https://0x3d.site

GitHub - Folyd/robotstxt: A native Rust port of Google's robots.txt parser and matcher C++ library.
A native Rust port of Google's robots.txt parser and matcher C++ library. - Folyd/robotstxt
Visit Site

GitHub - Folyd/robotstxt: A native Rust port of Google's robots.txt parser and matcher C++ library.

GitHub - Folyd/robotstxt: A native Rust port of Google's robots.txt parser and matcher C++ library.

robotstxt

Crates.io Docs.rs Apache 2.0

A native Rust port of Google's robots.txt parser and matcher C++ library.

  • Native Rust port, no third-part crate dependency
  • Zero unsafe code
  • Preserves all behavior of original library
  • Consistent API with the original library
  • 100% google original test passed

Installation

[dependencies]
robotstxt = "0.3.0"

Quick start

use robotstxt::DefaultMatcher;

let mut matcher = DefaultMatcher::default();
let robots_body = "user-agent: FooBot\n\
                   disallow: /\n";
assert_eq!(false, matcher.one_agent_allowed_by_robots(robots_body, "FooBot", "https://foo.com/"));

About

Quoting the README from Google's robots.txt parser and matcher repo:

The Robots Exclusion Protocol (REP) is a standard that enables website owners to control which URLs may be accessed by automated clients (i.e. crawlers) through a simple text file with a specific syntax. It's one of the basic building blocks of the internet as we know it and what allows search engines to operate.

Because the REP was only a de-facto standard for the past 25 years, different implementers implement parsing of robots.txt slightly differently, leading to confusion. This project aims to fix that by releasing the parser that Google uses.

The library is slightly modified (i.e. some internal headers and equivalent symbols) production code used by Googlebot, Google's crawler, to determine which URLs it may access based on rules provided by webmasters in robots.txt files. The library is released open-source to help developers build tools that better reflect Google's robots.txt parsing and matching.

Crate robotstxt aims to be a faithful conversion, from C++ to Rust, of Google's robots.txt parser and matcher.

Testing

$ git clone https://github.com/Folyd/robotstxt
Cloning into 'robotstxt'...
$ cd robotstxt/tests 
...
$ mkdir c-build && cd c-build
...
$ cmake ..
...
$ make
...
$ make test
Running tests...
Test project ~/robotstxt/tests/c-build
    Start 1: robots-test
1/1 Test #1: robots-test ......................   Passed    0.33 sec

License

The robotstxt parser and matcher Rust library is licensed under the terms of the Apache license. See LICENSE for more information.

More Resources
to explore the angular.

mail [email protected] to add your project or resources here 🔥.

Related Articles
to learn about angular.

FAQ's
to learn more about Angular JS.

mail [email protected] to add more queries here 🔍.

More Sites
to check out once you're finished browsing here.

0x3d
https://www.0x3d.site/
0x3d is designed for aggregating information.
NodeJS
https://nodejs.0x3d.site/
NodeJS Online Directory
Cross Platform
https://cross-platform.0x3d.site/
Cross Platform Online Directory
Open Source
https://open-source.0x3d.site/
Open Source Online Directory
Analytics
https://analytics.0x3d.site/
Analytics Online Directory
JavaScript
https://javascript.0x3d.site/
JavaScript Online Directory
GoLang
https://golang.0x3d.site/
GoLang Online Directory
Python
https://python.0x3d.site/
Python Online Directory
Swift
https://swift.0x3d.site/
Swift Online Directory
Rust
https://rust.0x3d.site/
Rust Online Directory
Scala
https://scala.0x3d.site/
Scala Online Directory
Ruby
https://ruby.0x3d.site/
Ruby Online Directory
Clojure
https://clojure.0x3d.site/
Clojure Online Directory
Elixir
https://elixir.0x3d.site/
Elixir Online Directory
Elm
https://elm.0x3d.site/
Elm Online Directory
Lua
https://lua.0x3d.site/
Lua Online Directory
C Programming
https://c-programming.0x3d.site/
C Programming Online Directory
C++ Programming
https://cpp-programming.0x3d.site/
C++ Programming Online Directory
R Programming
https://r-programming.0x3d.site/
R Programming Online Directory
Perl
https://perl.0x3d.site/
Perl Online Directory
Java
https://java.0x3d.site/
Java Online Directory
Kotlin
https://kotlin.0x3d.site/
Kotlin Online Directory
PHP
https://php.0x3d.site/
PHP Online Directory
React JS
https://react.0x3d.site/
React JS Online Directory
Angular
https://angular.0x3d.site/
Angular JS Online Directory