https://crlf.link/mem/offline/
Why
Digital content is ephemeral. Corporations have no monetary incentive to preserve things, so they don’t.
The Internet Archive is a great way to preserve things for the public good, but you can also keep an archive for yourself.
How
Archive.org
Download the HTML, then convert it to Markdown with Pandoc
ArchiveBox: see their GitHub community page for a long list of resources.
Readability: Mozilla’s library for extracting the main content from web pages (https://github.com/mozilla/readability)
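
The "download the HTML, then run Pandoc" route can be sketched in a few lines of Python. This is only a rough sketch, assuming the `pandoc` executable is installed and on your PATH. The helper names (`slug_for_url`, `pandoc_command`, `archive_page`) and the file-naming scheme are made up for illustration and are not part of any of the tools listed above.

```python
import re
import subprocess
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def slug_for_url(url):
    """Turn a URL into a filesystem-friendly name (hypothetical naming scheme)."""
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path
    # Collapse every run of non-alphanumeric characters into a single dash.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-")
    return slug or "page"


def pandoc_command(html_path, md_path):
    """Build the Pandoc invocation that converts an HTML file to Markdown."""
    return [
        "pandoc",
        "--from", "html",
        "--to", "markdown",
        "--output", str(md_path),
        str(html_path),
    ]


def archive_page(url, out_dir):
    """Save the raw HTML of `url` and a Markdown copy next to it."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    name = slug_for_url(url)
    html_path = out_dir / f"{name}.html"
    md_path = out_dir / f"{name}.md"

    # Keep the original HTML as well; the Markdown is a lossy, readable copy.
    with urllib.request.urlopen(url) as response:
        html_path.write_bytes(response.read())

    # check=True raises CalledProcessError if Pandoc exits with an error.
    subprocess.run(pandoc_command(html_path, md_path), check=True)
    return md_path
```

Calling `archive_page("https://crlf.link/mem/offline/", "archive")` would write both files under `archive/`. Pandoc converts the whole page, navigation and all, so running the HTML through something like Readability first may give cleaner Markdown.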