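Introduction
I usually mirror a site by hand with a command like the one below (-r recurses, -p fetches page requisites such as images and CSS, -np never ascends to the parent directory, -k rewrites links for local viewing):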
wget -r -p -np -k http://ics.nju.edu.cn/~jyywiki/
Often the pages you scrape from a website are rendered by JavaScript, so the raw HTML that wget saves is missing the content you want. In that case you can still use requests_html. The idea is to use the Chromium core to render the JavaScript page dynamically and then grab the important information.
from requests_html import HTMLSession
If you want to serve the downloaded pages locally, you need Express.
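var express = require('express');
var path = require('path');
var ejs = require('ejs');
// import the packages here
var app = express();
// view engine setup: render the .html files under ./wiki through EJS
app.set('views', path.join(__dirname, '/wiki'));
app.engine('html', ejs.__express);
app.set('view engine', 'html');
// you can implement the functions used by the cached pages here.
// '/*' is the Express 4 wildcard; Express 5 expects a named one such as '/*splat'
app.get('/*', function(req, res, next) {
  // map the request path to a view name, e.g. /a/b.html -> a/b.html
  var view = req.path.replace(/^\/+|\/+$/g, '') || 'index';
  // refuse paths that try to escape the views directory
  if (view.indexOf('..') !== -1) return next();
  res.type('html');
  res.render(view, function(err, html) {
    if (err) return next(err);
    res.send(html);
  });
});
// the port is an assumption; pick whichever one is free
app.listen(3000);
// credit: https://blog.csdn.net/u011481543/article/details/79696565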
Save it as server.js next to the wiki directory (the views path is relative to the script) and run:
node server.js