Author: 2night

Description: Fully conformant HTML5 DOM library with CSS4 selectors
Language: D
Repository: git://github.com/2night/arrogant.git
Created: 2018-06-07T13:07:39Z
Project community: https://github.com/2night/arrogant

License: MIT License

arrogant

Fully conformant HTML5 DOM library with CSS4 selectors. Based on Modest.

Tested on Linux. It should also work on OS X and Windows.

prerequisites: how to build and install Modest

Modest is written in pure C and has no external dependencies.
Fetch the source code and compile it:

    git clone https://github.com/2night/arrogant.git
    cd arrogant
    # Modest is pulled in as a git submodule
    git submodule update --init
    cd c/Modest
    make
    sudo make install
    # Refresh the shared-library cache (Linux)
    sudo ldconfig

run an example

    dub -c arrogant_test_app

hello world

    import arrogant;
    import std.stdio : writeln;

    void main()
    {
        auto src = `<html><head></head><body><div>Hello World</div></body></html>`;
        auto arrogant = Arrogant();
        auto tree = arrogant.parse(src);

        // Change div content from "Hello World" to "Hello D!"
        tree.byTagName("div").front.innerText = "Hello D!";

        // Print the edited html
        writeln(tree.document);

        assert(tree.document.innerHTML == "<html><head></head><body><div>Hello D!</div></body></html>");
    }

get data from webpage

    import arrogant;
    import std.net.curl;
    import std.stdio : writeln;

    void main()
    {
        // Download the forum index page
        auto src = "https://forum.dlang.org".get;
        auto arrogant = Arrogant();
        auto tree = arrogant.parse(src);

        size_t cnt = 0;
        writeln("Recent posts on forum.dlang.org:\n");

        // Search for summary divs
        foreach (post; tree.byClass("forum-index-col-lastpost"))
        {
            // Each field is read from the "title" attribute of the first matching node
            string title = post.byClass("forum-postsummary-subject").front["title"];
            string author = post.byClass("forum-postsummary-author").front["title"];
            string date = post.byCssSelector("span.forum-postsummary-time > span").front["title"];

            writeln("Title: ", title);
            writeln("By: ", author);
            writeln("Date: ", date);
            writeln("--------------");
            cnt++;
        }

        writeln("Total: ", cnt, " posts");
    }
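
The same lookup pattern can be tried without network access by parsing an inline string. The following is a minimal sketch that uses only calls already shown above (`Arrogant()`, `parse`, `byClass`, `byCssSelector`, `front`, attribute indexing). The helper `linkTitles`, the class name `item`, and the sample markup are hypothetical and exist only for illustration; the `.idup` copy is a defensive assumption so the strings do not depend on the lifetime of the parsed tree.

```d
import arrogant;
import std.stdio : writeln;

// Hypothetical helper (not part of arrogant): for every element with
// class "item", read the "title" attribute of its first <a> descendant.
string[] linkTitles(string html)
{
    auto arrogant = Arrogant();
    auto tree = arrogant.parse(html);

    string[] titles;
    foreach (item; tree.byClass("item"))
    {
        string title = item.byCssSelector("a").front["title"];
        // Copy the value (assumption: keeps it valid after the tree is gone)
        titles ~= title.idup;
    }
    return titles;
}

void main()
{
    // Illustrative markup, not taken from any real page
    auto src = `<html><head></head><body><ul>`
        ~ `<li class="item"><a href="/first" title="First post">one</a></li>`
        ~ `<li class="item"><a href="/second" title="Second post">two</a></li>`
        ~ `</ul></body></html>`;

    foreach (title; linkTitles(src))
        writeln("Title: ", title);
}
```

Keeping the markup inline makes it easier to experiment with selectors before pointing the code at a live page, whose class names may change over time.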

more

For more examples, check the code in the repository or read the documentation.