Documentation update! But the docs/index.md is still major WIP.

Rename to heckweasel.
General housekeeping update.
2023-08-05 12:09:20 -07:00 · 2023-04-18 17:31:25 -07:00 · 2023-03-06 16:22:10 -08:00 · 2021-12-19 22:02:47 -08:00 · 2021-06-30 00:39:50 -07:00 · 2021-04-28 23:09:35 -07:00
71 changed files with 841 additions and 448 deletions
--- a/2
+++ b/2
@ -1,6 +1,6 @@
 MIT License

-Copyright (c) 2018 Cas Rusnov
+Copyright (c) 2023 Cas Rusnov

 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
--- a/MANIFEST.in
+++ b/MANIFEST.in
@ -1 +1 @@
-include pixywerk2/defaults/*.yaml
+include heckweasel/defaults/*.yaml
--- a/README.md
+++ b/README.md
@ -1,5 +1,5 @@
-# Pixywerk #
+# Heckweasel #

-PixyWerk2 is a site compiler engineered like a metadata-based CMS with a template rendering system. Underneath it uses
+Heckweasel is a site compiler engineered like a metadata-based CMS with a template rendering system. Underneath it uses
 Jinja2 templates to provide programmability, and a structured metadata system, along with processors to convert
 user-friendly files such as Markdown and RST into HTML with templates.
--- a/TODO.md
+++ b/TODO.md
@ -1,8 +1,24 @@
 # TODO #

-* Pygments pretty printing of source code et al. including exposing that to the template API (`pygment_format(get_file_content('whatever.py'))`).
 * Smart CSS things (fill in the processors)
-
-# Maybe #
-
+* Project global defines, parameters.
+* pre- and post-scripts that will be run from __main__, either some shipped with heckweasel or project-level.
 * Library of template modules? ATOM et al.
+* Some off the shelf website templates and a template manager.
+* Live refreshing server thing which maps a heckweasel tree into a web server's memory and updates on change.
+* https://github.com/Python-Markdown/markdown/wiki/Third-Party-Extensions
+ * add markdown_link_attr_modifier extension
+ * add figureAltCaption extension
+ * add qrcode extension
+* Add support to define macros or whatever for Jinja, or to include generic stanzas in any output so adding macros won't mean repeatedly including them.
+
+* It'd be good to generate a dependency tree and only recompile things that have changed, for makefile-like behavior.
+
+* Fragments which would be blobs of mechanics like rss feed, thumbnail links, etc. They would be virtual files and other changes to processing
+  chains and project contents. `python -mheckweasel --fragment=rss,config=foo.meta` etc.
+  
+* Run commands as part of processing chains
+
+* Project level processing chain overrides in the .meta or whatever.
+
+
--- a/demo/.meta
+++ b/demo/.meta
@ -1,7 +0,0 @@
-{
-"site_root":"https://example.com",
-"title":"Test Metadata",
-"author": "Test User",
-"author_email": "test_user@example.com",
-"uuid_oid_root": "pixywerk-demo"
-}
--- a/demo/atom.xml
+++ b/demo/atom.xml
@ -1,33 +0,0 @@
-<?xml version="1.0" encoding="utf-8"?>
-
-<feed xmlns="http://www.w3.org/2005/Atom">
-
-	<title>{{ metadata.title }}</title>
-	<subtitle>{{ metadata.subtitle }}</subtitle>
-	<link href="{{ metadata.site_root }}/{{ metadata.file_name }}" rel="self" />
-	<link href="{{ metadata.site_root }}" />
-	<id>urn:uuid:{{ metadata.uuid }}</id>
-	<updated>{{ get_time_iso8601(metadata['build-time']) }}</updated>
-
-	{% set posts = get_file_list('blog_posts/*.cont') %}
-	{% for post in posts %}
-	{% set post_meta = get_file_metadata(post['file_path']) %}
-	<entry>
-		<title>{{ post_meta.title }}</title>
-		<link href="{{ metadata.site_root }}/{{post_meta.file_path}}" />
-		<id>urn:uuid:{{ post_meta.uuid }}</id>
-		<updated>{{ get_time_iso8601(post_meta.stat.mtime) }}</updated>
-		<summary>{{post_meta.summary }}</summary>
-		<!-- this would be the snippet, more than summary chunk -->
-		<!-- <content type="xhtml"> -->
-		<!-- 	<div xmlns="http://www.w3.org/1999/xhtml"> -->
-		<!-- 		<p>{{ post_meta.summary }}</p> -->
-		<!-- 	</div> -->
-		<!-- </content> -->
-		<author>
-			<name>{{ post_meta.author }}</name>
-			<email>{{ post_meta.author_email }}</email>
-		</author>
-	</entry>
-	{% endfor %}
-</feed>
--- a/demo/atom.xml.meta
+++ b/demo/atom.xml.meta
@ -1,5 +0,0 @@
-{
-"type": "templatable",
-"title": "Test RSS Feed",
-"subtitle": "Some Subtitle"
-}
--- a/demo/blog_posts/anotherpost.cont
+++ b/demo/blog_posts/anotherpost.cont
@ -1,5 +0,0 @@
-Some more post
-
-
-la la la
-
--- a/demo/blog_posts/anotherpost.cont.meta
+++ b/demo/blog_posts/anotherpost.cont.meta
@ -1,4 +0,0 @@
-{
-"title":"Another Post(tm)",
-"summary":"Yet another post"
-}
--- a/demo/blog_posts/test.cont
+++ b/demo/blog_posts/test.cont
@ -1 +0,0 @@
-Some content.
--- a/demo/blog_posts/test.cont.meta
+++ b/demo/blog_posts/test.cont.meta
@ -1,4 +0,0 @@
-{
-"title":"Test.cont",
-"summary":"Some empty test content"
-}
--- a/demo/foo.cont
+++ b/demo/foo.cont
@ -1 +0,0 @@
-yo fresh
--- a/demo/foo.cont.meta
+++ b/demo/foo.cont.meta
@ -1,5 +0,0 @@
-{
-"foo":"bar",
-"title":"A title",
-"summary":"Just a post."
-}
--- a/demo/index.cont
+++ b/demo/index.cont
@ -1,19 +0,0 @@
-<h1>Index of all content</h1>
-{% for f in get_file_list('*', sort_order='file_name') %}
-<a href="{{ get_file_name(f['file_name']) }}">{{get_file_name(f['file_name'])}}</a>
-{% endfor %}
-
-<p>Including foo.cont.meta:
-<pre>
-{{ get_file_content('foo.cont.meta') }}
-</pre>
-</p>
-
-<h1>Metadata</h1>
-<table class="metadata">
-<tr><th>key</th><th>value</th></tr>
-{% set metadata = get_file_metadata('foo.cont') %}
-{% for k in metadata.keys() %}
-<tr><td>{{k}}</td><td>{{metadata[k]}}</td></tr>
-{% endfor %}
-</table>
--- a/demo/mapping.json
+++ b/demo/mapping.json
--- a/demo/passthrough.md
+++ b/demo/passthrough.md
@ -1,9 +0,0 @@
-# README #
-
-This is a test of the emergency compiled HTML system. This is only a *test*.
-
-[Foo!](foo.html)
-
-{% for i in range(100) %}
-* {{ i }}
-{% endfor %}
--- a/demo/passthrough.md.meta
+++ b/demo/passthrough.md.meta
@ -1,3 +0,0 @@
-{
-"pragma":["no-proc"]
-}
--- a/demo/readme.md
+++ b/demo/readme.md
@ -1,9 +0,0 @@
-# README #
-
-This is a test of the emergency compiled HTML system. This is only a *test*.
-
-[Foo!](foo.html)
-
-{% for i in range(100) %}
-* {{ i }}
-{% endfor %}
--- a/demo/readme.md.meta
+++ b/demo/readme.md.meta
@ -1,3 +0,0 @@
-{
-"title":"Yo, markdown"
-}
--- a/demo/templates/debug.jinja2
+++ b/demo/templates/debug.jinja2
@ -1,32 +0,0 @@
-<!DOCTYPE html>
-<head>
-<title>Debug for {{path}}</title>
-<style type="text/css">
-table { border: 1px solid black; }
-div { border: 1px solid black; }
-td { border: 1px solid black; }
-</style>
-</head>
-<body>
-<p>{{path}}</p>
-<h1>Content</h1>
-<div class="content">
-{{content}}
-</div>
-
-<h1>Environment</h1>
-<table class="environment">
-<tr><th>key</th><th>value</th></tr>
-{% for k in environ.keys() %}
-<tr><td>{{k}}</td><td>{{environ[k]}}</td></tr>
-{% endfor %}
-</table>
-
-<h1>Metadata</h1>
-<table class="metadata">
-<tr><th>key</th><th>value</th></tr>
-{% for k in metadata.keys() %}
-<tr><td>{{k}}</td><td>{{metadata[k]}}</td></tr>
-{% endfor %}
-</table>
-</body>
--- a/demo/templates/default-fs.jinja2
+++ b/demo/templates/default-fs.jinja2
@ -1,6 +0,0 @@
-<table class="werk-file-list">
-<tr class="werk-file-list-head"><th>file</th><th>type</th><th>size</th><th>last change</th></tr>
-{% for f in files.keys() %}
-<tr class="werk-file-list-item"><td><a href="/{{files[f].relpath}}">{{f}}</a></td><td>{{files[f].type}}</td><td>{{files[f].size}}</td><td>{{files[f].ctime | date}}</td></tr>
-{% endfor %}
-</table>
--- a/demo/templates/default.jinja2
+++ b/demo/templates/default.jinja2
@ -1,13 +0,0 @@
-<!DOCTYPE html>
-<head>
-<title>{{metadata.title}}</title>
-<style type="text/css">
-table { border: 1px solid black; }
-div { border: 1px solid black; }
-td { border: 1px solid black; }
-</style>
-</head>
-<body>
-{{content}}
-</body>
-</html>
--- a/docs/index.md
+++ b/docs/index.md
@ -0,0 +1,151 @@
+# HECKWEASEL documentation!
+
+Welcome to the index for the HECKWEASEL documentation. In this directory you'll find a bunch of files, but this is the introduction you need to understand how heckweasel works and how to use it. You wouldn't web a site.
+
+## Introduction: TL;DR
+
+Heckweasel compiles a set of files into a website.
+
+There is your website **template**, separate files that are the **content** of your website (such as a blog post, an image, etc.), and JSON files that are the **metadata** for each content file. These get compiled together into static web pages.
+
+There's a lot more to it, and it is entirely programmable, but basically that's it.
+
+
+## Introduction: What the hyeck is Heckweasel?!
+
+Heckweasel is a website compiler framework. Primarily, it allows the creation of a web site from a collection of flat files in a maintainable form, producing the less maintainable formats that web browsers use.
+
+The flat files in a heckweasel project are just a directory of files like any other. There is a default directory structure for projects but that isn't important right now.
+
+Heckweasel projects generally take the form of a collection of one or more templates and a collection of one or more files that are filled into the templates. Pervasively, heckweasel draws a distinction between the contents of a web page and the template it gets put into. You can think of the template, as generally used by heckweasel, as a sort of picture frame into which your content is placed. The content itself may be written in one of several popular formats such as Markdown and HTML. Also of note: there are two routes from heckweasel input to heckweasel output. One route goes through the template system; the other merely copies the input to the output.
+
+Another important detail about heckweasel is metadata. Every item in the heckweasel project (thus, every file in the heckweasel project directory) has a collection of *metadata* associated with it, such as its file name, creation time, and other objective information, but also any arbitrary information about it such as its title, a short description, thumbnails or whatever. It's also important to note that the **content** of a file counts as metadata, and is stored the same way in heckweasel's view of the files. Metadata for a file is stored alongside it as *filename*.meta, and a directory's metadata lives in a file called .meta. Metadata is also inherited! So setting a template in a directory's metadata will apply to all of the contents of that directory. Metadata is all in a JSON format called JStyleSon, which is JSON except you can have comments in it. All of this metadata is accessible from the templates, which leads to...
+
+The final important detail about heckweasel is that it, at its core, uses a programmable template system called Jinja. Jinja allows a lot, and I mean a *lot*, of flexibility in the way that the output is produced, giving complete programmability. This allows templates (and pages, for that matter) to contain programmable outcomes such as showing a list of all blog entries (each of which would be a separate file), or making a thumbnail gallery from a collection of pictures, or generating an RSS feed from all of the contents of the site. This also allows the website design to be broken into parts such that commonly-used patterns can be merely included in the file rather than being written repeatedly (although normally this function is done with the page templates).
+
+
+## Glossary
+
+- **template**
+    - A Jinja2 file which gets filled in with your content
+- **content**
+    - The content which gets filled into templates to produce pages
+- **metadata**
+    - Extra variables or values associated with content, which can be used to modify the way a template works and do other tricks
+
+
+
+## Just the very Basic Heckweasel Project
+
+So with all of that said, the most basic possible heckweasel project that is actually functional would be something like a page template and a content file called index. Heckweasel operates on an input directory and outputs to an output directory. A project this small is admittedly not a normal use case, since it doesn't benefit much from the elaborate system underneath, but it gets the idea across.
+
+So you have your project directory `mywebsite`; inside we can have the directories `source` and `publish`, and various files, and well here's a picture:
+
+- __mywebsite__
+    - __source__
+        - *.meta*
+        - __templates__
+            - *default.jinja*
+        - *index.md*
+        - *index.md.meta*
+    - __publish__
+  
+  
+To explain the various files:
+
+
+### *.meta*
+
+This file is a JSON file containing project-wide metadata. Usually this would be metadata that applies, by default, to all files. Some things that affect the way Heckweasel processes files would be `template` which would set the default template to put content into and `templates` which would set the directory to look for templates in. By custom we also may want to set the title, author and other things like that which we may want to fill into the output files. We also put things like the eventual published address for the site (`site_root`).
+
+Example .meta file:
+
+
+```json
+{
+    "site_root": "https://website.me",
+    "author": "Very Nice Person",
+    "title": "My Website"
+}
+
+```
+
+### *default.jinja*
+
+This is the default template. Heckweasel will look for `templates/default.jinja` unless another templates directory and template are specified. Jinja templates might output any kind of text file you want, but usually we put HTML inside them. Here's an example `default.jinja` that makes a barely functional web page but we'll explain more later:
+
+```jinja2
+<!DOCTYPE html>
+<html>
+<head>
+<title>{{ metadata.title }}</title>
+</head>
+<body>
+{{content}}
+</body>
+</html>
+```
+
+The main thing to notice is that this is a very simple HTML file. It does the bare minimum to render in a browser. The next thing to notice is all of the `{}` things. Those are Jinja commands. A `{{}}` containing a name is replaced with the value of that name from the variables set in the Jinja environment. In Heckweasel the main two things are `content` and `metadata`. `metadata` contains the metadata set via the `.meta` files and other sources as discussed above. The new thing here is `content`, which is the *contents* of the page! As discussed above, the contents and template are considered separately, and so the page contents are filled into the template where the `{{content}}` tag is! You can also see that the title of the page is set based on the page's `title` metadata. We'll discuss this more in the next section.
+
+Another interesting thing is, any styling that should be applied to the whole website, to a particular page type, or whatever goes in templates. For example this is where you'd include the site-wide CSS sheet for this site, and it would apply that style to all the pages (we'll discuss this more in a future section).
+
+
+### *index.md*
+
+This is the contents of the page that will eventually become `index.html` when heckweasel is done with it. Notice it is `.md` which means markdown, a user-friendly markup format - heckweasel will convert this to an HTML fragment and fill in the template's `content` with the result, producing `index.html`. This is how the magic happens! The contents of this file could be something as simple as:
+
+```markdown
+# Welcome!
+
+Hello this is my website! Hi!
+```
+
+### *index.md.meta*
+
+This contains the metadata specific to `index.md`. It can be left out if there isn't any specific metadata, but it's useful to make even an empty one for future reference. An example use of this is to set a different title for each page.
+
+Example:
+
+```json
+{
+  "title": "Welcome to my Home Page"
+}
+```
+
+### Rolling it all together
+
+Given the above tree, compiling this from the command line in the `mywebsite` directory would be as simple as:
+
+```bash
+$ python -mheckweasel source publish
+```
+
+This would produce, in the `publish` directory, `index.html`, which would have contents like:
+
+```html
+<!DOCTYPE html>
+<html>
+<head>
+<title>Welcome to my Home Page</title>
+</head>
+<body>
+<h1>Welcome!</h1>
+<p>Hello this is my website! Hi!</p>
+</body>
+</html>
+```
+
+Notice how the result of converting `index.md` into HTML is inserted into the template where `{{content}}` was, and the value of `title` from `index.md.meta` is inserted where `{{metadata.title}}` was. While `index.md` inherited the top-level `title` metadata from the top `.meta`, its own `index.md.meta` file overrode it. Neat!
+
+The `publish` directory is ready to be served by a small HTTP server, placed in a web content directory, or whatever. We'll discuss that in a future section about hosting your Heckweasel site.
+
+## Getting (very slightly) more advanced with Heckweasel
+
+Now that we see how a project and its parts fit together we can make our little website slightly more interesting. 
+
+### Styling your Web Site
+
+As we alluded to above, the templates are where style information generally lives. 
+
+
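The inheritance that docs/index.md describes (a file's own `.meta` overriding the directory-level `.meta`) can be sketched in Python roughly as follows. This is only an illustration under simple assumptions, not heckweasel's actual `MetaTree` code: the helper name `merge_metadata` is hypothetical, and it merges top-level keys only, whereas the real implementation may treat nested values differently.

```python
from typing import Any, Dict


def merge_metadata(*layers: Dict[str, Any]) -> Dict[str, Any]:
    """Merge metadata layers; later (more specific) layers win.

    Hypothetical helper for illustration, not part of heckweasel.
    """
    merged: Dict[str, Any] = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Values taken from the example .meta and index.md.meta in docs/index.md.
site_meta = {
    "site_root": "https://website.me",
    "author": "Very Nice Person",
    "title": "My Website",
}
page_meta = {"title": "Welcome to my Home Page"}

index_meta = merge_metadata(site_meta, page_meta)
print(index_meta["title"])   # the page-level title overrides the site-level one
print(index_meta["author"])  # inherited from the directory's .meta
```

Listing the layers from least to most specific keeps the override rule in one place: whichever layer comes last wins.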
--- a/READMES/METADATA.md
+++ b/READMES/METADATA.md
@ -15,6 +15,8 @@ On-disk meatdata is stored as a file along side the non-metadata file with the e

 All files define the following keys by default:

+relpath
+:  The relative path to the root of the site, useful for prepending to image `src=` and other resource paths such as CSS files and fonts in order to maintain locally viewable output.
 file_name
 :  The local path of the file
 file_path
@ -60,6 +62,14 @@ author_email
 site_root
 :  The full URL for the root of this web site used for links and whatnot, with ending slash.

+Special Keys that can be defined, these change the processing in predictable ways:
+
+type
+: Define the file that this metadata is applied to as a specific type from the type mapping table. Useful values are `passthrough` and `templatable` with obvious outcomes.
+wildcard_metadata
+: Define a dictionary of file globs (patterns which match files such as `*.txt`), with the value being a dictionary of additional metadata to apply to the matched files. This is generally
+defined at the top level of the project so that certain file patterns are treated specially without having to give them their own metadata.
+

 ## CACHING STRATEGY ##

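The `wildcard_metadata` key added to METADATA.md above can be pictured with a short Python sketch. It assumes the globs are matched against file names with `fnmatch`-style rules and that the extra metadata is laid over the file's existing metadata; the helper name `apply_wildcard_metadata` and both assumptions are illustrative, not taken from heckweasel's source.

```python
from fnmatch import fnmatch
from typing import Any, Dict


def apply_wildcard_metadata(
    file_name: str,
    metadata: Dict[str, Any],
    wildcard_metadata: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Return a copy of metadata with the extras from every matching glob applied.

    Hypothetical helper for illustration, not part of heckweasel.
    """
    result = dict(metadata)
    for pattern, extra in wildcard_metadata.items():
        if fnmatch(file_name, pattern):
            result.update(extra)
    return result


# Hypothetical top-level setting: treat every .txt file as passthrough.
wildcards = {"*.txt": {"type": "passthrough"}}

txt_meta = apply_wildcard_metadata("notes.txt", {"title": "Notes"}, wildcards)
md_meta = apply_wildcard_metadata("index.md", {"title": "Home"}, wildcards)
print(txt_meta)
print(md_meta)
```

Only `notes.txt` picks up the extra `type` key here; `index.md` does not match the glob and keeps its metadata unchanged.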
--- a/docs/patterns.md
+++ b/docs/patterns.md
@ -0,0 +1,5 @@
+# Patterns for Site Design #
+
+These are some simple patterns for things commonly needed in websites of various kinds.
+
+##
--- a/READMES/project-layout.md
+++ b/READMES/project-layout.md
@ -1,11 +1,13 @@
 # Project Layout #

-It is recommended that in general your project for PixyWerk2 site be layed out like:
+It is recommended that, in general, your Heckweasel site project be laid out like:
 ```
 project_top/
   Makefile              - Convenient for building your site
   src/                  - All "source" pages are contained in here.
     .meta               - Top-level default metadata is set here
+    index.cont          - The content part of the index page
+    index.cont.meta     - A metadata JSON file for the index, specifically.
     templates/          - Templates go in here
       default.jinja2    - Default template that will be used if none are specified
   publish/              - The path the build process will create, where the post-processed files go.
@ -19,7 +21,7 @@ site. Something as simple as:

 ```
 build: src/templates/* src/*
-	python -mpixywerk2 src publish
+	python -mheckweasel src publish
 ```

 ## src/ ##
@ -66,4 +68,4 @@ A simple default.jinja2 example:

 ## publish/ ##

-This is arbitrary, and will be created by pixywerk at build time, but it will be the root path that should be published to your web server.
+This is arbitrary, and will be created by heckweasel at build time, but it will be the root path that should be published to your web server.
--- a/docs/templatefunctions.md
+++ b/docs/templatefunctions.md
@ -0,0 +1,113 @@
+# Template Functions #
+
+These are functions exposed to the templates which perform various useful actions for the site designer.
+
+## get_file_list ##
+
+Return a list of file names based on a wildcard glob, matched against the root of the project.
+
+Prototype: `get_file_list(file_glob, sort_order, reverse, limit) -> [files]`
+
+Arguments:
+* file_glob: A standard file glob, for example `*.txt` matches all files that end in `.txt` in the root of the project. (default: `*`)
+* sort_order: One of `file_path`, `file_name`, `ctime`, `mtime`, `size`, or `ext` (default: `ctime`)
+* reverse: Whether the sort is reversed (default: `False`)
+* limit: The number of entries to return from the top of the list, 0 for unlimited (default: `0`)
+
+Returns:
+* A list of file names.
+
+## get_file_name ##
+
+Return the filename that will result from processing the specified file based on the processors that it will be passed through.
+
+Prototype: `get_file_name(file) -> outfile`
+
+Arguments:
+* file: The name of a file, with path, from root.
+
+Returns:
+* outfile: The name of the file, with path, that will result from processing.
+
+## get_file_content ##
+
+Return the rendered content of specified file. Caution: Can result in infinite loops if two templates include each other.
+
+Prototype: `get_file_content(file) -> content`
+
+Arguments:
+* file: The name of the input file, with path, from root.
+
+Returns:
+* content: the contents that result from passing the specified file through its processors.
+
+## get_raw ##
+
+Return the raw contents of a source file. It is specifically not passed through any processing.
+
+Prototype: `get_raw(file) -> content`
+
+Arguments:
+* file: The name of the input file, with path, from root.
+
+Returns:
+* content: the raw contents of the input file
+
+## get_file_metadata ##
+
+Return the metadata tree associated with a particular file.
+
+Prototype: `get_file_metadata(file) -> metadata`
+
+Arguments:
+* file: the name of an input file, with path, from root
+
+Returns:
+* metadata: A dictionary of metadata loaded from the file tree.
+
+## get_time_iso8601 ##
+
+Return the date/time stamp in ISO 8601 format, in UTC, for a given time_t timestamp.
+
+Prototype: `get_time_iso8601(timestamp) -> timestamp`
+
+Arguments:
+* timestamp: A time_t integer or float, in seconds since Jan 1 1970.
+
+Returns:
+* timestamp: A string in ISO 8601 format of the date and time, in the UTC timezone.
+
+## get_date_iso8601 ##
+
+Return the date stamp in ISO 8601 format, in UTC, for a given time_t timestamp.
+
+Prototype: `get_date_iso8601(timestamp) -> timestamp`
+
+Arguments:
+* timestamp: A time_t integer or float, in seconds since Jan 1 1970.
+
+Returns:
+* timestamp: A string in ISO 8601 format of the date stamp, in the UTC timezone.
+
+## pygments_get_css ##
+
+Return a blob of CSS produced from Pygments for a given `style`.
+
+Prototype: `pygments_get_css(style) -> css`
+
+Arguments:
+* style (optional): A style identifier for Pygments' `HtmlFormatter`.
+
+Returns:
+* css: A string of styles as returned by Pygments' `HtmlFormatter`.
+
+## pygments_markup_contents_html ##
+
+Format a code fragment with Pygments.
+
+Prototype: `pygments_markup_contents_html(input, filetype, style) -> html`
+
+Arguments:
+* input: A string containing the code to format (either literal, or imported with get_raw()).
+* filetype: A string describing which lexer to use.
+* style (optional): A style identifier for Pygments' `HtmlFormatter`.
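As a rough picture of what `get_time_iso8601` and `get_date_iso8601` are documented to return, the following standalone Python sketch formats a `time_t` value in UTC using only the standard library. The function names `to_time_iso8601` and `to_date_iso8601` are hypothetical; heckweasel's own helpers are built by factories that take the timezone name (see `time_iso8601("UTC")` in the main module diff), so treat this as an approximation of the output format rather than the project's implementation.

```python
from datetime import datetime, timezone


def to_time_iso8601(timestamp: float) -> str:
    """Format seconds since the epoch as an ISO 8601 date/time in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def to_date_iso8601(timestamp: float) -> str:
    """Format seconds since the epoch as an ISO 8601 date in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).date().isoformat()


print(to_time_iso8601(0))  # 1970-01-01T00:00:00+00:00
print(to_date_iso8601(0))  # 1970-01-01
```

The `+00:00` offset has the same shape as the `Published:` stamp in the removed `examples/pixywerk.com/publish/posts/2019-04-15.html`.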
--- a/examples/pixywerk.com/Makefile
+++ b/examples/pixywerk.com/Makefile
@ -1,2 +0,0 @@
-build: src/templates/* src/* src/images/* src/posts/*
-	python -mpixywerk2 src publish
--- a/examples/pixywerk.com/README.md
+++ b/examples/pixywerk.com/README.md
@ -1,4 +0,0 @@
-# Pixywerk.com Example #
-
-This is an example blog system with the features most blogs would have (posts, tag cloud, atom/rss feeds,
-index with images).
--- a/examples/pixywerk.com/foo
+++ b/examples/pixywerk.com/foo
--- a/examples/pixywerk.com/publish/css/main.css
+++ b/examples/pixywerk.com/publish/css/main.css
@ -1,3 +0,0 @@
-
-
-body { margin: 10% 10% 0 10% }
--- a/examples/pixywerk.com/publish/images/20190415-0.jpg
+++ b/examples/pixywerk.com/publish/images/20190415-0.jpg
--- a/examples/pixywerk.com/publish/images/placeholder
+++ b/examples/pixywerk.com/publish/images/placeholder
--- a/examples/pixywerk.com/publish/index.html
+++ b/examples/pixywerk.com/publish/index.html
@ -1,13 +0,0 @@
-<html>
-  <head>
-    <title></title>
-    <link rel="stylesheet" type="text/css" href="css/main.css">
-  </head>
-  <body>
-    <p>This is my index!!</p>
-
-for i in posts[:5]:
-
-get metadata, fill in post image/text summary with link
-  </body>
-</html>
--- a/examples/pixywerk.com/publish/posts/2019-04-15.html
+++ b/examples/pixywerk.com/publish/posts/2019-04-15.html
@ -1,18 +0,0 @@
-<html>
-  <head>
-    <title>My first post</title>
-    <link rel="stylesheet" type="text/css" href="css/main.css">
-  </head>
-  <body>
-    <img src="../images/20190415-0.jpg" class="featured">
-<div class="byline">
-  <p>Author: Cas Rusnov</p>
-  <p>Published: 2019-04-16T01:42:27.156392+00:00
-    
-  </p>
-</div>
-<p>This is an example post!</p>
-<p>yo fresh</p>
-<p>There are many posts like it but this one is mine.</p>
-  </body>
-</html>
--- a/examples/pixywerk.com/publish/templates/default.jinja2
+++ b/examples/pixywerk.com/publish/templates/default.jinja2
@ -1,9 +0,0 @@
-<html>
-  <head>
-    <title>{{ metadata.title }}</title>
-    <link rel="stylesheet" type="text/css" href="css/main.css">
-  </head>
-  <body>
-    {{ content }}
-  </body>
-</html>
--- a/examples/pixywerk.com/src/.meta
+++ b/examples/pixywerk.com/src/.meta
@ -1,6 +0,0 @@
-{
-"author": "Cas Rusnov",
-"author_email": "rusnovn@gmail.com",
-"uuid-oid-root": "pixywerk.com/",
-"site_root": "https://pixywerk.com/"
-}
--- a/examples/pixywerk.com/src/css/main.css
+++ b/examples/pixywerk.com/src/css/main.css
@ -1,3 +0,0 @@
-
-
-body { margin: 10% 10% 0 10% }
--- a/examples/pixywerk.com/src/images/20190415-0.jpg
+++ b/examples/pixywerk.com/src/images/20190415-0.jpg
--- a/examples/pixywerk.com/src/images/placeholder
+++ b/examples/pixywerk.com/src/images/placeholder
--- a/examples/pixywerk.com/src/index.thtml
+++ b/examples/pixywerk.com/src/index.thtml
@ -1,5 +0,0 @@
-<p>This is my index!!</p>
-
-for i in posts[:5]:
-
-get metadata, fill in post image/text summary with link
--- a/examples/pixywerk.com/src/posts/2019-04-15.thtml
+++ b/examples/pixywerk.com/src/posts/2019-04-15.thtml
@ -1,12 +0,0 @@
-<img src="{{ metadata.featured }}" class="featured">
-<div class="byline">
-  <p>Author: {{ metadata.author }}</p>
-  <p>Published: {{ get_time_iso8601(metadata.stat.ctime) }}
-    {% if metadata.stat.mtime-metadata.stat.ctime > 512 %}
-    Updated: {{ get_time_iso8601(metadata.stat.mtime) }}
-    {% endif %}
-  </p>
-</div>
-<p>This is an example post!</p>
-<p>yo fresh</p>
-<p>There are many posts like it but this one is mine.</p>
--- a/examples/pixywerk.com/src/posts/2019-04-15.thtml.meta
+++ b/examples/pixywerk.com/src/posts/2019-04-15.thtml.meta
@ -1,4 +0,0 @@
-{
-"title":"My first post",
-"featured":"../images/20190415-0.jpg"
-}
--- a/examples/pixywerk.com/src/templates/default.jinja2
+++ b/examples/pixywerk.com/src/templates/default.jinja2
@ -1,9 +0,0 @@
-<html>
-  <head>
-    <title>{{ metadata.title }}</title>
-    <link rel="stylesheet" type="text/css" href="css/main.css">
-  </head>
-  <body>
-    {{ content }}
-  </body>
-</html>
--- a/heckweasel/init.py
+++ b/heckweasel/init.py
@ -0,0 +1 @@
+__version__ = '0.7.0'
--- a/heckweasel/main.py
+++ b/heckweasel/main.py
@ -11,14 +11,24 @@ import os
 import shutil
 import sys
 import time
-
 from typing import Dict, List, cast

+from .metadata import MetaTree
 from .processchain import ProcessorChains
 from .processors.processors import PassthroughException
-from .metadata import MetaTree
-from .template_tools import file_list, file_name, file_content, file_metadata, time_iso8601
-
+from .pygments import pygments_get_css, pygments_markup_contents_html
+from .template_tools import (
+    date_iso8601,
+    file_content,
+    file_list,
+    file_list_hier,
+    file_json,
+    file_metadata,
+    file_name,
+    file_raw,
+    time_iso8601,
+)
+from .utils import deep_merge_dicts

 logger = logging.getLogger()

@ -27,23 +37,30 @@ def setup_logging(verbose: bool = False) -> None:
    pass


-def get_args(args: List[str]) -> argparse.Namespace:
-    parser = argparse.ArgumentParser("Compile a Pixywerk directory into an output directory.")
+def parse_var(varspec: str) -> List:
+    if '=' not in varspec:
+        return [varspec, True]
+    return varspec.split('=', 1)

-    parser.add_argument("root", help="The root of the pixywerk directory to process.")
+
+def get_args(args: List[str]) -> argparse.Namespace:
+    parser = argparse.ArgumentParser("Compile a Heckweasel directory into an output directory.")
+
+    parser.add_argument("root", help="The root of the heckweasel directory to process.")
    parser.add_argument("output", help="The output directory to export post-compiled files to.")

    parser.add_argument(
        "-c", "--clean", help="Remove the target tree before proceeding (by renaming to .bak).", action="store_true"
    )
    parser.add_argument("-s", "--safe", help="Abort if the target directory already exists.", action="store_true")
+    parser.add_argument("-f", "--follow-links", help="Follow symbolic links in the input tree.", action="store_true")
    parser.add_argument("-t", "--template", help="The template directory (default: root/templates)", default=None)
    parser.add_argument("-d", "--dry-run", help="Perform a dry-run.", action="store_true")
    parser.add_argument("-v", "--verbose", help="Output verbosely.", action="store_true")
    parser.add_argument("--processors", help="Specify a path to a processor configuration file.", default=None)
-
+    parser.add_argument(
+        "-D", "--define", help="Add a variable to the metadata.", nargs="+", action="extend", type=parse_var)
    result = parser.parse_args(args)
-
    # validate arguments
    if not os.path.isdir(result.root):
        raise FileNotFoundError("can't find root folder {}".format(result.root))
@ -75,25 +92,37 @@ def main() -> int:
        "dir-template": "default-dir.jinja2",
        "filters": {},
        "build-time": time.time(),
-        "uuid-oid-root": "pixywerk",
+        "uuid-oid-root": "heckweasel",
        "summary": "",
        "description": "",
        "author": "",
-        "author_email": ""
+        "author_email": "",
    }
+    if args.define:
+        for var in args.define:
+            default_metadata[var[0]] = var[1]
    meta_tree = MetaTree(args.root, default_metadata)
    file_list_cache = cast(Dict, {})
    file_cont_cache = cast(Dict, {})
    file_name_cache = cast(Dict, {})
+    file_raw_cache = cast(Dict, {})
+    flist = file_list(args.root, file_list_cache)
    default_metadata["globals"] = {
-        "get_file_list": file_list(args.root, file_list_cache),
+        "get_file_list": flist,
+        "get_hier": file_list_hier(args.root, flist),
        "get_file_name": file_name(args.root, meta_tree, process_chains, file_name_cache),
        "get_file_content": file_content(args.root, meta_tree, process_chains, file_cont_cache),
+        "get_json": file_json(args.root),
+        "get_raw": file_raw(args.root, file_raw_cache),
        "get_file_metadata": file_metadata(meta_tree),
        "get_time_iso8601": time_iso8601("UTC"),
+        "get_date_iso8601": date_iso8601("UTC"),
+        "pygments_get_css": pygments_get_css,
+        "pygments_markup_contents_html": pygments_markup_contents_html,
+        "merge_dicts": deep_merge_dicts,
    }

-    for root, _, files in os.walk(args.root):
+    for root, _, files in os.walk(args.root, followlinks=args.follow_links):
        workroot = os.path.relpath(root, args.root)
        if workroot == ".":
            workroot = ""
@ -112,7 +141,7 @@ def main() -> int:
                continue
            metadata = meta_tree.get_metadata(os.path.join(workroot, f))
            chain = process_chains.get_chain_for_filename(os.path.join(root, f), ctx=metadata)
-            print("process {} -> {}".format(os.path.join(root, f), os.path.join(target_dir, chain.output_filename)))
+            print("process {} -> {} -> {}".format(os.path.join(root, f), repr(chain), os.path.join(target_dir, chain.output_filename)))
            if not args.dry_run:
                try:
                    with open(os.path.join(target_dir, chain.output_filename), "w") as outfile:
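Review note on the new `-D/--define` option: the sketch below shows how repeated `NAME=VALUE` defines could be folded into the default metadata, mirroring the loop added to `main()`. The real `parse_var` is not visible in this part of the diff, so the version here is a hypothetical stand-in, and `parse_defines` is an illustrative helper rather than project code. Note that `action="extend"` only exists in argparse from Python 3.8 onward, which may matter for the py36 and py37 tox environments.

```python
import argparse
from typing import Dict, List, Tuple


def parse_var(text: str) -> Tuple[str, str]:
    """Hypothetical stand-in for the project's parse_var: split NAME=VALUE once."""
    name, sep, value = text.partition("=")
    if not sep or not name:
        raise argparse.ArgumentTypeError("expected NAME=VALUE, got {!r}".format(text))
    return name, value


def parse_defines(argv: List[str]) -> Dict[str, str]:
    """Illustrative helper: parse -D options and merge them into a metadata dict."""
    parser = argparse.ArgumentParser("demo")
    # action="extend" requires Python 3.8 or newer.
    parser.add_argument("-D", "--define", nargs="+", action="extend", type=parse_var)
    args = parser.parse_args(argv)

    metadata = {"author": ""}
    # Same shape as the loop in main(): each define overwrites the default key.
    for name, value in args.define or []:
        metadata[name] = value
    return metadata
```

Under these assumptions, `-D author=cas draft=1 -D site=x=y` yields three string-valued keys, and only the first `=` in each define acts as the separator.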
--- a/heckweasel/defaults/chains.yaml
+++ b/heckweasel/defaults/chains.yaml
@ -8,7 +8,14 @@ default:
 templatable:
    extension: null
    chain:
-        - jinja2
+      - jinja2
+
+# Any object that needs jinja and to be embedded in a parent template
+tembed:
+  extension: null
+  chain:
+    - jinja2
+    - jinja2_page_embed

 # Markdown, BBCode and RST are first run through the templater, and then
 # they are processed into HTML, and finally embedded in a page template.
@ -62,24 +69,24 @@ template-html:
        - jinja2
        - jinja2_page_embed

-# Smart CSS are simply converted to CSS.
-sass:
-    extension:
-        - sass
-        - scss
-    chain:
-        - process_sass
-less:
-    extension:
-        - less
-    chain:
-        - process_less
+# # Smart CSS are simply converted to CSS.
+# sass:
+#     extension:
+#         - sass
+#         - scss
+#     chain:
+#         - process_sass
+# less:
+#     extension:
+#         - less
+#     chain:
+#         - process_less

-stylus:
-    extension:
-        - styl
-    chain:
-        - process_styl
+# stylus:
+#     extension:
+#         - styl
+#     chain:
+#         - process_styl

 # # Images are processed into thumbnails and sized in addition to being retained as their original
 # FIXME implement split chain processor, implement processor arguments,
--- a/heckweasel/metadata.py
+++ b/heckweasel/metadata.py
@ -1,11 +1,11 @@
 """Constructs a tree-like object containing the metadata for a given path, and caches said metadata."""

+import fnmatch
 import logging
 import mimetypes
 import os
 import uuid
-
-from typing import Dict, Optional, Union, List, Tuple, Any, cast
+from typing import Any, Dict, List, Optional, Tuple, Union, cast

 import jstyleson

@ -93,7 +93,7 @@ class MetaTree:
        """Retrieve the metadata for a given path

        The general procedure is to iterate the tree, at each level
-m        load .meta (JSON formatted dictionary) for that level, and
+        load .meta (JSON formatted dictionary) for that level, and
        then finally load the path.meta, and merge these dictionaries
        in descendant order.

@ -108,11 +108,15 @@ m        load .meta (JSON formatted dictionary) for that level, and
        # iterate path components from root to target path
        comps = [self._root] + rel_path.split("/")
        fullpath = ""
+        ospath = os.path.join(self._root, rel_path)
        for pth in comps:
            fullpath = os.path.join(fullpath, pth)
            st = os.stat(fullpath)

-            cachekey = fullpath + ".meta"
+            if os.path.isdir(fullpath):
+                cachekey = os.path.join(fullpath, ".meta")
+            else:
+                cachekey = fullpath + ".meta"
            meta = cast(Dict, {})
            try:
                st_meta = os.stat(cachekey)
@ -126,16 +130,20 @@ m        load .meta (JSON formatted dictionary) for that level, and
                meta = jstyleson.load(open(cachekey, "r"))
                self._cache.put(cachekey, meta, st_meta.st_mtime)

+            if fullpath == ospath and "wildcard_metadata" in metablob:
+                for wild in metablob["wildcard_metadata"]:
+                    if fnmatch.fnmatch(pth, wild[0]):
+                        metablob.update(wild[1])
+
            metablob.update(meta)

        # return final dict
        metablob["dir"], metablob["file_name"] = os.path.split(rel_path)
        metablob["file_path"] = rel_path
-        metablob["uuid"] = uuid.uuid3(
-            uuid.NAMESPACE_OID, metablob["uuid-oid-root"] + os.path.join(self._root, rel_path)
-        )
+        metablob["relpath"] = os.path.relpath("/", "/" + metablob["dir"])
+        metablob["uuid"] = uuid.uuid3(uuid.NAMESPACE_OID, metablob["uuid-oid-root"] + ospath)
        metablob["os-path"], _ = os.path.split(fullpath)
-        metablob["guessed-type"] = guess_mime(os.path.join(self._root, rel_path))
+        metablob["guessed-type"] = guess_mime(ospath)
        if "mime-type" not in metablob:
            metablob["mime-type"] = metablob["guessed-type"]
        metablob["stat"] = {}
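Review note on `wildcard_metadata`: the sketch below isolates the merge order that the hunk above appears to implement for the final path component. Inherited metadata comes first, then every wildcard entry whose pattern matches the file name, then the file's own `.meta`. The function name `apply_wildcards` and the sample keys are assumptions for illustration only; the real logic lives inside `MetaTree.get_metadata`.

```python
import fnmatch
from typing import Any, Dict


def apply_wildcards(file_name: str, inherited: Dict[str, Any], own_meta: Dict[str, Any]) -> Dict[str, Any]:
    """Illustrative helper: merge inherited, wildcard and per-file metadata."""
    result = dict(inherited)
    # Each entry is a [pattern, values] pair, as in metablob["wildcard_metadata"].
    for pattern, values in inherited.get("wildcard_metadata", []):
        if fnmatch.fnmatch(file_name, pattern):
            result.update(values)
    # The file's own .meta is applied last, so it overrides wildcard values.
    result.update(own_meta)
    return result


inherited = {
    "template": "default.jinja2",
    "wildcard_metadata": [
        ["*.md", {"template": "post.jinja2"}],
        ["draft-*", {"published": False}],
    ],
}
draft = apply_wildcards("draft-intro.md", inherited, {"title": "Intro"})
logo = apply_wildcards("logo.png", inherited, {})
```

Here `draft` picks up values from both matching patterns plus its own title, while `logo` keeps the inherited template unchanged.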
--- a/heckweasel/processchain.py
+++ b/heckweasel/processchain.py
@ -3,8 +3,7 @@
 import os
 import os.path
 import random
-
-from typing import List, Iterable, Optional, Any, Dict, Type, cast
+from typing import Any, Dict, Iterable, List, Optional, Type, cast

 import yaml

@ -91,6 +90,9 @@ class ProcessorChain:
            fname = processor.filename(fname, self._ctx)
        return fname

+    def __repr__(self) -> str:
+        return "[" + ",".join([x.__class__.__name__ for x in self._processors]) + "]"
+

 class ProcessorChains:
    """Load a configuration for processor chains, and provide ability to process the chains given a particular input
--- a/heckweasel/processors/__init__.py
+++ b/heckweasel/processors/__init__.py
--- a/heckweasel/processors/jinja2.py
+++ b/heckweasel/processors/jinja2.py
@ -1,6 +1,6 @@
 """Define a Jinja2 Processor which applies programmable templating to the input stream."""

-from typing import Iterable, Optional, Dict, cast
+from typing import Dict, Iterable, Optional, cast

 from jinja2 import Environment, FileSystemLoader

@ -22,11 +22,10 @@ class Jinja2(PassThrough):
            iterable: The post-processed output stream
        """
        ctx = cast(Dict, ctx)
-        template_env = Environment(loader=FileSystemLoader(ctx["templates"]))
+        template_env = Environment(loader=FileSystemLoader(ctx["templates"]), extensions=["jinja2.ext.do"])
        template_env.globals.update(ctx["globals"])
        template_env.filters.update(ctx["filters"])
        tmpl = template_env.from_string("".join([x for x in input_file]))
        return tmpl.render(metadata=ctx)

-
 processor = Jinja2
--- a/heckweasel/processors/jinja2_page_embed.py
+++ b/heckweasel/processors/jinja2_page_embed.py
@ -3,8 +3,7 @@
   the target template is rendered)."""

 import os
-
-from typing import Iterable, Optional, Dict, cast
+from typing import Dict, Iterable, Optional, cast

 from jinja2 import Environment, FileSystemLoader

@ -25,8 +24,7 @@ class Jinja2PageEmbed(Processor):
            str: the new name for the file

        """
-
-        return os.path.splitext(oldname)[0] + ".html"
+        return os.path.splitext(oldname)[0] + "." + self.extension(oldname, ctx)

    def mime_type(self, oldname: str, ctx: Optional[Dict] = None) -> str:
        """Return the mimetype of the post-processed file.
@ -39,7 +37,7 @@ class Jinja2PageEmbed(Processor):
            str: the new mimetype of the file after processing

        """
-        return "text/html"
+        return ctx.get("mime", "text/html")

    def process(self, input_file: Iterable, ctx: Optional[Dict] = None) -> Iterable:
        """Return an iterable object of the post-processed file.
@ -52,7 +50,7 @@ class Jinja2PageEmbed(Processor):
            iterable: The post-processed output stream
        """
        ctx = cast(Dict, ctx)
-        template_env = Environment(loader=FileSystemLoader(ctx["templates"]))
+        template_env = Environment(loader=FileSystemLoader(ctx["templates"]), extensions=["jinja2.ext.do"])
        template_env.globals.update(ctx["globals"])
        template_env.filters.update(ctx["filters"])
        tmpl = template_env.get_template(ctx["template"])
@ -70,7 +68,7 @@ class Jinja2PageEmbed(Processor):
            str: the new extension of the file after processing

        """
-        return "html"
+        return ctx.get("extension", "html")


 processor = Jinja2PageEmbed
--- a/heckweasel/processors/passthrough.py
+++ b/heckweasel/processors/passthrough.py
@ -1,10 +1,10 @@
 """Passthrough processor which takes input and returns it."""

 import os
+from typing import Dict, Iterable, Optional, cast

-from .processors import Processor, PassthroughException
 from ..utils import guess_mime
-from typing import Iterable, Optional, Dict, cast
+from .processors import PassthroughException, Processor


 class PassThrough(Processor):
--- a/heckweasel/processors/process_less.py
+++ b/heckweasel/processors/process_less.py
--- a/heckweasel/processors/process_md.py
+++ b/heckweasel/processors/process_md.py
@ -2,8 +2,7 @@

 import io
 import os
-
-from typing import Iterable, Optional, Dict
+from typing import Dict, Iterable, Optional

 import markdown

--- a/heckweasel/processors/process_pp.py
+++ b/heckweasel/processors/process_pp.py
--- a/heckweasel/processors/process_sass.py
+++ b/heckweasel/processors/process_sass.py
--- a/heckweasel/processors/process_styl.py
+++ b/heckweasel/processors/process_styl.py
--- a/heckweasel/processors/processors.py
+++ b/heckweasel/processors/processors.py
@ -1,6 +1,5 @@
 import abc
-
-from typing import Iterable, Optional, Dict
+from typing import Dict, Iterable, Optional


 class PassthroughException(Exception):
@ -65,3 +64,6 @@ class Processor(abc.ABC):  # pragma: no cover
        Returns:
            iterable: The post-processed output stream
        """
+
+    def __repr__(self) -> str:
+        return self.__class__.__name__
--- a/heckweasel/pygments.py
+++ b/heckweasel/pygments.py
@ -0,0 +1,36 @@
+"""Map Pygments into the Template API for inclusion in outputs."""
+from typing import Optional, cast
+
+import pygments
+import pygments.formatters
+import pygments.lexers
+import pygments.styles
+import pygments.util
+
+
+def pygments_markup_contents_html(input_text: str, file_type: str, style: Optional[str] = None) -> str:
+    """Format input string with Pygments and return HTML."""
+
+    if style is None:
+        style = "default"
+    pyst = pygments.styles.get_style_by_name(style)
+    formatter = pygments.formatters.get_formatter_by_name("html", style=pyst)
+    try:
+        lexer = pygments.lexers.get_lexer_for_filename(file_type)
+    except pygments.util.ClassNotFound:
+        try:
+            lexer = pygments.lexers.get_lexer_by_name(file_type)
+        except pygments.util.ClassNotFound:
+            lexer = pygments.lexers.get_lexer_by_mimetype(file_type)
+
+    return pygments.highlight(input_text, lexer, formatter)
+
+
+def pygments_get_css(style: Optional[str] = None) -> str:
+    """Return the CSS styles associated with a particular style definition."""
+
+    if style is None:
+        style = "default"
+    pyst = pygments.styles.get_style_by_name(style)
+    formatter = pygments.formatters.get_formatter_by_name("html", style=pyst)
+    return formatter.get_style_defs()
--- a/heckweasel/template_tools.py
+++ b/heckweasel/template_tools.py
@ -0,0 +1,145 @@
+import copy
+import datetime
+import glob
+import itertools
+import os
+from typing import Callable, Dict, Iterable, List, Union, cast, Tuple
+
+import jstyleson
+
+import pytz
+
+from .metadata import MetaTree
+from .processchain import ProcessorChains
+from .utils import deep_merge_dicts
+
+
+def file_list(root: str, listcache: Dict) -> Callable:
+    def get_file_list(
+            path_glob: Union[str, List[str], Tuple[str]],
+            *,
+            sort_order: str = "ctime",
+            reverse: bool = False,
+            limit: int = 0) -> Iterable:
+        stattable = cast(List, [])
+        if isinstance(path_glob, str):
+            path_glob = [path_glob]
+        for pglob in path_glob:
+            if pglob in listcache:
+                stattable.extend(listcache[pglob])
+            else:
+                for fil in glob.glob(os.path.join(root, pglob)):
+                    if os.path.isdir(fil):
+                        continue
+                    if fil.endswith(".meta") or fil.endswith("~"):
+                        continue
+                    st = os.stat(fil)
+                    stattable.append(
+                        {
+                            "file_path": os.path.relpath(fil, root),
+                            "file_name": os.path.split(fil)[-1],
+                            "mtime": st.st_mtime,
+                            "ctime": st.st_ctime,
+                            "size": st.st_size,
+                            "ext": os.path.splitext(fil)[1],
+                        }
+                    )
+                listcache[pglob] = stattable
+        ret = sorted(stattable, key=lambda x: x[sort_order], reverse=reverse)
+        if limit > 0:
+            return itertools.islice(ret, limit)
+        return ret
+
+    return get_file_list
+
+
+def file_list_hier(root: str, flist: Callable) -> Callable:
+    """Return a callable which, given a directory, will walk the directory and return the files within
+    it that match the glob passed."""
+
+    def get_file_list_hier(path: str, glob: str, *, sort_order: str = "ctime", reverse: bool = False) -> Iterable:
+        output = []
+
+        for pth in os.walk(os.path.join(root, path)):
+            output.extend(
+                flist(
+                    os.path.join(os.path.relpath(os.path.realpath(pth[0]), root), glob),
+                    sort_order=sort_order,
+                    reverse=reverse,
+                )
+            )
+
+        return output
+
+    return get_file_list_hier
+
+
+def file_name(root: str, metatree: MetaTree, processor_chains: ProcessorChains, namecache: Dict) -> Callable:
+    def get_file_name(file_name: str) -> Dict:
+        if file_name in namecache:
+            return namecache[file_name]
+        metadata = metatree.get_metadata(file_name)
+        chain = processor_chains.get_chain_for_filename(os.path.join(root, file_name), ctx=metadata)
+        namecache[file_name] = chain.output_filename
+        return namecache[file_name]
+
+    return get_file_name
+
+
+def file_raw(root: str, contcache: Dict) -> Callable:
+    def get_raw(file_name: str) -> str:
+        if file_name in contcache:
+            return contcache[file_name]
+        with open(os.path.join(root, file_name), "r", encoding="utf-8") as f:
+            contcache[file_name] = f.read()
+            return contcache[file_name]
+
+    return get_raw
+
+
+def file_json(root: str) -> Callable:
+    def get_json(file_name: str, parent: Dict = None) -> Dict:
+        outd = {}
+        if parent is not None:
+            outd = copy.deepcopy(parent)
+
+        with open(os.path.join(root, file_name), "r", encoding="utf-8") as f:
+            return deep_merge_dicts(outd, jstyleson.load(f))
+
+    return get_json
+
+
+def file_content(root: str, metatree: MetaTree, processor_chains: ProcessorChains, contcache: Dict) -> Callable:
+    def get_file_content(file_name: str) -> Iterable:
+        if file_name in contcache:
+            return contcache[file_name]
+        metadata = metatree.get_metadata(file_name)
+        chain = processor_chains.get_chain_for_filename(os.path.join(root, file_name), ctx=metadata)
+        contcache[file_name] = str(chain.output)
+        return contcache[file_name]
+
+    return get_file_content
+
+
+def file_metadata(metatree: MetaTree) -> Callable:
+    def get_file_metadata(file_name: str) -> Dict:
+        return metatree.get_metadata(file_name)
+
+    return get_file_metadata
+
+
+def time_iso8601(timezone: str) -> Callable:
+    tz = pytz.timezone(timezone)
+
+    def get_time_iso8601(time_t: Union[int, float]) -> str:
+        return datetime.datetime.fromtimestamp(time_t, tz).isoformat("T")
+
+    return get_time_iso8601
+
+
+def date_iso8601(timezone: str) -> Callable:
+    tz = pytz.timezone(timezone)
+
+    def get_date_iso8601(time_t: Union[int, float]) -> str:
+        return datetime.datetime.fromtimestamp(time_t, tz).strftime("%Y-%m-%d")
+
+    return get_date_iso8601
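Review note on the multi-glob `get_file_list`: in the version above, `listcache[pglob] = stattable` appears to store the list accumulated across all globs processed so far, so a later cache hit for one glob could also return entries from earlier globs. A minimal sketch of per-glob caching follows. `list_files` is a hypothetical name, and the entries carry only two of the fields the real function collects.

```python
import glob
import os
from typing import Dict, List, Sequence, Union


def list_files(
    root: str,
    path_glob: Union[str, Sequence[str]],
    cache: Dict[str, List[dict]],
    sort_order: str = "file_name",
) -> List[dict]:
    """Illustrative helper: list files for one or more globs, caching each glob separately."""
    globs = [path_glob] if isinstance(path_glob, str) else list(path_glob)
    combined: List[dict] = []
    for pglob in globs:
        if pglob not in cache:
            entries = []  # fresh list per glob, so cache entries never mix
            for fil in glob.glob(os.path.join(root, pglob)):
                # Skip directories, metadata sidecars and editor backups.
                if os.path.isdir(fil) or fil.endswith((".meta", "~")):
                    continue
                entries.append(
                    {"file_path": os.path.relpath(fil, root), "file_name": os.path.basename(fil)}
                )
            cache[pglob] = entries
        combined.extend(cache[pglob])
    # Sort once, after all globs have been collected.
    return sorted(combined, key=lambda entry: entry[sort_order])
```

Sorting after the loop also means an empty glob list returns an empty result instead of leaving the sorted list undefined.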
--- a/heckweasel/tests/unit/__init__.py
+++ b/heckweasel/tests/unit/__init__.py
--- a/heckweasel/tests/unit/test_processchain.py
+++ b/heckweasel/tests/unit/test_processchain.py
--- a/heckweasel/utils.py
+++ b/heckweasel/utils.py
@ -0,0 +1,72 @@
+from typing import Dict, Optional
+import copy
+import mimetypes
+import os
+
+
+def merge_dicts(dict_a: Dict, dict_b: Dict) -> Dict:
+    """Merge two dictionaries (shallow).
+
+    Arguments:
+        dict_a (dict): The dictionary to use as the base.
+        dict_b (dict): The dictionary to update the values with.
+
+    Returns:
+        dict: A new merged dictionary.
+
+    """
+    dict_z = dict_a.copy()
+    dict_z.update(dict_b)
+    return dict_z
+
+
+def deep_merge_dicts(dict_a: Dict, dict_b: Dict, _path=None, cpy=False) -> Dict:
+    """Merge two dictionaries (deep).
+    https://stackoverflow.com/questions/7204805/how-to-merge-dictionaries-of-dictionaries/7205107#7205107
+
+    Arguments:
+        dict_a (dict): The dictionary to use as the base.
+        dict_b (dict): The dictionary to update the values with.
+        _path (list): internal use.
+
+    Returns:
+        dict: dict_a merged in place, or a merged deep copy of it if cpy is set.
+
+    """
+    if cpy:
+        dict_a = copy.deepcopy(dict_a)
+    if _path is None:
+        _path = []
+    for key in dict_b:
+        if key in dict_a:
+            if isinstance(dict_a[key], dict) and isinstance(dict_b[key], dict):
+                deep_merge_dicts(dict_a[key], dict_b[key], _path + [str(key)])
+            elif dict_a[key] == dict_b[key]:
+                pass  # same leaf value
+            else:
+                dict_a[key] = copy.deepcopy(dict_b[key])
+        else:
+            dict_a[key] = dict_b[key]
+    return dict_a
+
+
+def guess_mime(path: str) -> Optional[str]:
+    """Guess the mime type for a given path.
+
+    Arguments:
+        path (str): the full path (root joined with sub-path) of the file
+            or directory to inspect
+
+    Returns:
+        str: the guessed mime-type
+
+    """
+    mtypes = mimetypes.guess_type(path)
+    ftype = None
+    if os.path.isdir(path):
+        ftype = "directory"
+    elif os.access(path, os.F_OK) and mtypes[0]:
+        ftype = mtypes[0]
+    else:
+        ftype = "application/octet-stream"
+    return ftype
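Review note on `deep_merge_dicts`: the function above mutates its first argument unless `cpy` is set, which is easy to miss when it is exposed to templates as `merge_dicts`. The sketch below is a simplified re-implementation written for illustration. It is not the project function: it deep-copies every overriding value and drops the `_path` bookkeeping. The name `deep_merge` and the sample data are assumptions.

```python
import copy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any], cpy: bool = False) -> Dict[str, Any]:
    """Simplified sketch: recursively merge override into base."""
    if cpy:
        base = copy.deepcopy(base)  # leave the caller's dict untouched
    for key, value in override.items():
        if isinstance(base.get(key), dict) and isinstance(value, dict):
            deep_merge(base[key], value)  # descend into nested dicts
        else:
            base[key] = copy.deepcopy(value)  # override (or add) the leaf
    return base


defaults = {"site": {"title": "Demo", "nav": {"home": "/"}}, "draft": False}
page = {"site": {"nav": {"blog": "/blog"}}, "draft": True}

merged_copy = deep_merge(defaults, page, cpy=True)  # defaults is unchanged here
```

Calling `deep_merge(defaults, page)` without `cpy=True` would instead return `defaults` itself, now carrying the values from `page`.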
--- a/importwp.py
+++ b/importwp.py
@ -0,0 +1,162 @@
+"""Convert a WordPress XML dump into a (mostly working) heckweasel tree."""
+
+import argparse
+import datetime
+import json
+import os
+import sys
+from urllib.parse import urlparse
+from xml.etree.ElementTree import ElementTree
+
+import requests
+
+FILE_PATTERN = "{postdate}-{postname}.thtml"
+
+
+def parse_args(args):
+    parser = argparse.ArgumentParser("importwp.py")
+
+    parser.add_argument("input", help="The input file.")
+    parser.add_argument("out_dir", help="Output root directory.", nargs="?", default='.')
+    parser.add_argument("--fetch-attachments", help="Fetch all attachments referred to in file.", action="store_true", dest='fetch_attachments')
+    parser.add_argument("--attachment-dir", help="Subdirectory to place attachments in.", default="attachments", dest='attachment_dir')
+    parser.add_argument("--post-dir", help="Subdirectory to place posts in.", default="posts", dest='post_dir')
+    parser.add_argument("--page-dir", help="Subdirectory to place pages in.", default="", dest='page_dir')
+
+    result = parser.parse_args(args)
+    result.post_dir = os.path.join(result.out_dir, result.post_dir)
+    result.page_dir = os.path.join(result.out_dir, result.page_dir)
+    result.attachment_dir = os.path.join(result.out_dir, result.attachment_dir)
+
+    return result
+
+
+def parse_input(xmlpath):
+    tree = ElementTree()
+
+    tree_root = tree.parse(source=xmlpath)
+    posts = {}
+    attachments = {}
+    pages = {}
+
+    for node in tree_root.find("channel"):
+        if node.tag == "item":
+            post_type = node.find("{http://wordpress.org/export/1.2/}post_type")
+            if post_type is not None:
+                status = node.find("{http://wordpress.org/export/1.2/}status")
+                if status is not None and status.text == "draft":
+                    continue
+                content = node.find("{http://purl.org/rss/1.0/modules/content/}encoded")
+                title = node.find("title")
+                pubdate = node.find("pubDate")
+                description = node.find("description")
+                post_name = node.find("{http://wordpress.org/export/1.2/}post_name")
+                categories = node.findall("category")
+                post_id = node.find("{http://wordpress.org/export/1.2/}post_id")
+                post_parent = node.find("{http://wordpress.org/export/1.2/}post_parent")
+                if post_type.text == "post":
+                    # found a post!
+                    posts[post_id.text] = {'content':content,
+                                           'title':title,
+                                           'pubdate':pubdate,
+                                           'description':description,
+                                           'post_name':post_name,
+                                           'categories':categories,
+                                           'post_parent':post_parent}
+                elif post_type.text == "attachment":
+                    # attachment
+                    att_url = node.find("{http://wordpress.org/export/1.2/}attachment_url")
+
+                    attachments[post_id.text] = {'content':content,
+                                                 'title':title,
+                                                 'pubdate':pubdate,
+                                                 'description':description,
+                                                 'post_name':post_name,
+                                                 'categories':categories,
+                                                 'post_parent':post_parent,
+                                                 'att_url':att_url,}
+                elif post_type.text == "page":
+                    pages[post_id.text] = {'content':content,
+                                           'title':title,
+                                           'pubdate':pubdate,
+                                           'description':description,
+                                           'post_name':post_name,
+                                           'categories':categories,
+                                           'post_parent':post_parent}
+
+    return posts, attachments, pages
+
+def fetch_attachment(attch, outdir):
+    url = attch['att_url'].text
+    p = urlparse(url)
+    filename = os.path.join(outdir, os.path.split(p.path)[-1])
+    print("fetching attachment",url,"->",filename)
+    r = requests.get(url)
+    with open(filename, 'wb') as outf:
+        outf.write(r.content)
+
+def save_cont(post, outdir):
+    dt = datetime.datetime.strptime(post['pubdate'].text, "%a,  %d %b %Y %H:%M:%S %z")
+    postdate = dt.strftime("%Y-%m-%d-%H%M%S")
+    filename = FILE_PATTERN.format(postdate=postdate, postname=post['post_name'].text)
+    print(post['title'].text, "->", filename)
+    with open(os.path.join(outdir, filename), "w") as outf:
+        outf.write(post['content'].text)
+        # handle attachments
+
+        tags = []
+        category = ""
+        for tg in post['categories']:
+            if "domain" in tg.attrib and tg.attrib["domain"] == "category":
+                category = tg.text
+            else:
+                tags.append(tg.text)
+
+    with open(os.path.join(outdir, filename + ".meta"), "w") as outf:
+        metadata = {
+            "title": post['title'].text,
+            "description": post['description'].text,
+            "post_time": dt.timestamp(),
+            "featured": "",
+            "tags": tags,
+            "category": category,
+        }
+        json.dump(metadata, outf)
+
+
+def main():
+    args = parse_args(sys.argv[1:])
+    try:
+        os.mkdir(args.out_dir)
+    except FileExistsError:
+        pass
+
+    try:
+        os.mkdir(args.page_dir)
+    except FileExistsError:
+        pass
+
+    try:
+        os.mkdir(args.post_dir)
+    except FileExistsError:
+        pass
+
+    if args.fetch_attachments:
+        try:
+            os.mkdir(args.attachment_dir)
+        except FileExistsError:
+            pass
+
+    posts, attachments, pages = parse_input(args.input)
+
+    if args.fetch_attachments:
+        [fetch_attachment(post, args.attachment_dir) for post in attachments.values()]
+
+    [save_cont(post, args.post_dir) for post in posts.values()]
+    [save_cont(page, args.page_dir) for page in pages.values()]
+
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
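Review note on `save_cont`: the sketch below isolates how an RSS-style `pubDate` string is turned into an output file name, using the same `FILE_PATTERN` and timestamp format as the importer. `post_filename` is a hypothetical helper and the sample date and slug are made up. The `%a` and `%b` directives depend on the process locale, so this assumes an English or C locale.

```python
import datetime

FILE_PATTERN = "{postdate}-{postname}.thtml"


def post_filename(pubdate: str, post_name: str) -> str:
    """Illustrative helper: build the output file name for one post."""
    # pubDate looks like "Thu, 23 May 2019 17:51:21 +0000".
    dt = datetime.datetime.strptime(pubdate, "%a, %d %b %Y %H:%M:%S %z")
    # The file name uses the wall-clock time exactly as written in the dump.
    postdate = dt.strftime("%Y-%m-%d-%H%M%S")
    return FILE_PATTERN.format(postdate=postdate, postname=post_name)


name = post_filename("Thu, 23 May 2019 17:51:21 +0000", "hello-world")
```

Because the date prefix sorts lexicographically, the generated file names also sort chronologically within the posts directory.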
--- a/pixywerk2/template_tools.py
+++ b/pixywerk2/template_tools.py
@ -1,80 +0,0 @@
-import datetime
-import glob
-import itertools
-import os
-import pytz
-from typing import Callable, Dict, List, Iterable, Union, cast
-
-from .metadata import MetaTree
-from .processchain import ProcessorChains
-
-
-def file_list(root: str, listcache: Dict) -> Callable:
-    def get_file_list(path_glob: str, *, sort_order: str = "ctime", reverse: bool = False, limit: int = 0) -> Iterable:
-        stattable = cast(List, [])
-        if path_glob in listcache:
-            stattable = listcache[path_glob]
-        else:
-            for fil in glob.glob(os.path.join(root, path_glob)):
-                if os.path.isdir(fil):
-                    continue
-                if fil.endswith(".meta") or fil.endswith("~"):
-                    continue
-                st = os.stat(fil)
-                stattable.append(
-                    {
-                        "file_path": os.path.relpath(fil, root),
-                        "file_name": os.path.split(fil)[-1],
-                        "mtime": st.st_mtime,
-                        "ctime": st.st_ctime,
-                        "size": st.st_size,
-                        "ext": os.path.splitext(fil)[1],
-                    }
-                )
-            listcache[path_glob] = stattable
-        ret = sorted(stattable, key=lambda x: x[sort_order], reverse=reverse)
-        if limit > 0:
-            return itertools.islice(ret, limit)
-        return ret
-
-    return get_file_list
-
-
-def file_name(root: str, metatree: MetaTree, processor_chains: ProcessorChains, namecache: Dict) -> Callable:
-    def get_file_name(file_name: str) -> Dict:
-        if file_name in namecache:
-            return namecache[file_name]
-        metadata = metatree.get_metadata(file_name)
-        chain = processor_chains.get_chain_for_filename(os.path.join(root, file_name), ctx=metadata)
-        namecache[file_name] = chain.output_filename
-        return namecache[file_name]
-
-    return get_file_name
-
-
-def file_content(root: str, metatree: MetaTree, processor_chains: ProcessorChains, contcache: Dict) -> Callable:
-    def get_file_content(file_name: str) -> Iterable:
-        if file_name in contcache:
-            return contcache[file_name]
-        metadata = metatree.get_metadata(file_name)
-        chain = processor_chains.get_chain_for_filename(os.path.join(root, file_name), ctx=metadata)
-        contcache[file_name] = chain.output
-        return chain.output
-
-    return get_file_content
-
-
-def file_metadata(metatree: MetaTree) -> Callable:
-    def get_file_metadata(file_name: str) -> Dict:
-        return metatree.get_metadata(file_name)
-
-    return get_file_metadata
-
-
-def time_iso8601(timezone: str) -> Callable:
-    tz = pytz.timezone(timezone)
-
-    def get_time_iso8601(time_t: Union[int, float]) -> str:
-        return datetime.datetime.fromtimestamp(time_t, tz).isoformat("T")
-
-    return get_time_iso8601
--- a/pixywerk2/tests/unit/__init__.py
+++ b/pixywerk2/tests/unit/__init__.py
--- a/pixywerk2/utils.py
+++ b/pixywerk2/utils.py
@ -1,42 +0,0 @@
-import mimetypes
-import os
-
-from typing import Dict, Optional
-
-
-def merge_dicts(dict_a: Dict, dict_b: Dict) -> Dict:
-    """Merge two dictionaries.
-
-    Arguments:
-        dict_a (dict): The dictionary to use as the base.
-        dict_b (dict): The dictionary to update the values with.
-
-    Returns:
-        dict: A new merged dictionary.
-
-    """
-    dict_z = dict_a.copy()
-    dict_z.update(dict_b)
-    return dict_z
-
-
-def guess_mime(path: str) -> Optional[str]:
-    """Guess the mime type for a given path.
-
-    Arguments:
-        root (str): the root path of the file tree
-        path (str): the sub-path within the file tree
-
-    Returns:
-        str: the guessed mime-type
-
-    """
-    mtypes = mimetypes.guess_type(path)
-    ftype = None
-    if os.path.isdir(path):
-        ftype = "directory"
-    elif os.access(path, os.F_OK) and mtypes[0]:
-        ftype = mtypes[0]
-    else:
-        ftype = "application/octet-stream"
-    return ftype
--- a/demo/bar/baz/quux/quuux
+++ b/demo/bar/baz/quux/quuux
--- a/setup.py
+++ b/setup.py
@ -1,9 +1,11 @@
 """Package configuration."""
 from setuptools import find_packages, setup

-LONG_DESCRIPTION = """Pixywerk 2 is a filesystem based static site generator."""
+from heckweasel import __version__

-INSTALL_REQUIRES = ["yaml-1.3", "markdown", "jstyleson", "jinja2"]
+LONG_DESCRIPTION = """Heckweasel is a filesystem based static site generator."""
+
+INSTALL_REQUIRES = ["yaml-1.3", "markdown", "jstyleson", "jinja2", "pygments"]

 # Extra dependencies
 EXTRAS_REQUIRE = {
@ -26,7 +28,7 @@ EXTRAS_REQUIRE = {
 SETUP_REQUIRES = ["pytest-runner>=2.7.1", "setuptools_scm>=1.15.0"]
 setup(
    author="Cassowary Rusnov",
-    author_email="rusnovn@gmail.com",
+    author_email="alderconestudio@gmail.com",
    classifiers=[
        "Development Status :: 1 - Pre-alpha",
        "Environment :: Console",
@ -49,11 +51,12 @@ setup(
    keywords=["cms", "website", "compiler"],
    license="MIT",
    long_description=LONG_DESCRIPTION,
-    name="pixywerk2",
+    name="heckweasel",
    packages=find_packages(exclude=["*.tests", "*.tests.*"]),
    platforms=["GNU/Linux"],
    setup_requires=SETUP_REQUIRES,
    use_scm_version=True,
-    url="https://git.antpanethon.com/cas/pixywerk2",
+    url="https://git.aldercone.studio/aldercone/heckweasel",
    zip_safe=False,
+    version=__version__,
 )
--- a/tox.ini
+++ b/tox.ini
@ -1,5 +1,5 @@
 [tox]
-envlist=py{36,37}-{code-quality, unit} #, py37-sphinx
+envlist=py{36,37,38,39}-{code-quality, unit} #, py37-sphinx
 skipsdist = true

 [testenv]
@ -7,16 +7,18 @@ setenv =
    LANG = en_US.UTF-8
 deps = .[tests]
 commands =
-	 unit: py.test --strict --cov-report=term-missing --cov=pixywerk2 pixywerk2/tests/unit {posargs}
-	 code-quality: flake8 pixywerk2
-	 code-quality: black -l 120 --check pixywerk2
+	 unit: py.test --strict --cov-report=term-missing --cov=heckweasel heckweasel/tests/unit {posargs}
+	 code-quality: flake8 heckweasel
+	 code-quality: black -l 120 --check heckweasel
 	 code-quality: - prospector -A
-	 code-quality: - mypy --ignore-missing-imports pixywerk2
+	 code-quality: - mypy --ignore-missing-imports heckweasel
 	 # sphinx: python setup.py build_sphinx -b html
   	 # sphinx: python setup.py build_sphinx -b man
 basepython =
    py36: python3.6
    py37: python3.7
+    py38: python3.8
+    py39: python3.9

 [flake8]
 max-line-length = 120
Author	SHA1	Message	Date
Cassowary Rusnov	8404f8927d	Documentation update! But the docs/index.md is still major WIP.	2023-08-05 12:09:20 -07:00
Cassowary Rusnov	f448a1f1ee	Rename to heckweasel.	2023-04-18 17:31:25 -07:00
Cassowary Rusnov	727b2b9309	General housekeeping update. - Bump version to 0.6.0 - Add template functions to merge dictionaries for loading JSON data inside them - Add extension management separate from MIME type - Make the `tembed` processor which runs a generic jinja template through embedding in the template	2023-03-06 16:22:10 -08:00
Cassowary Rusnov	357db6eca4	Major additions to support JSON files and provide compile time options - Add file_json/get_file_json handling. This creates a new global template function to treat a file as a json file and returns a dict. - Add some tools for merging dictionaries. - Add command-line settable variables that get inserted into metadata tree so that at runtime options can be set.	2021-12-19 22:02:47 -08:00
Cassowary Rusnov	4780764a60	Minor changes. Formatting changes. Add some Python version environments for testing. Extended get_file_list to allow a list of globs rather than just a single glob.	2021-06-30 00:39:50 -07:00
Cassowary Rusnov	b8bc24cf6f	Reformatted with automated tools and minor fixes.	2021-04-28 23:09:35 -07:00
Cas Rusnov	bf0b7a1cb7	Comment out smart CSS from default mapping. Fix minor bug in template_tools	2019-06-03 19:23:34 -07:00
Cas Rusnov	39dde28e35	Updates! Some documentation expansion. Add {do} support to jinja systems	2019-05-26 19:39:11 -07:00
Cas Rusnov	a0c4381c99	Major development update. * Updated LICENSE, READMES/METADATA.md and TODO.md * Added example blog to examples/ * Added preliminary Pygments support for embedding code in pages. * Add preliminary Wordpress dump importer * Expansions to template_tools and metadata to support Blog use case.	2019-05-23 17:51:21 -07:00
Cas Rusnov	81532f3462	Minor doc additions.	2019-04-17 19:47:00 -07:00