[dependencies]
chrono = "0.4.41"
-regex = "1.11.1"
+regex = "1"
serde_json = "1.0.140"
}
code {
- color: white;
- background-color: black;
+ background-color: #EEE;
padding: 1px 4px 1px 4px;
- border-radius: 4px;
+ /*border-radius: 4px;*/
}
pre {
- color: white;
- background-color: black;
+ background-color: #EEE;
padding: 4px 4px 4px 4px;
- border-radius: 4px;
+ /*border-radius: 4px;*/
+}
+
+img {
+ width: 100%;
}
## Goodnight!
```
-In HTML:
+In LaTeX:
```
-<h3 id="goodnight!">Goodnight!</h3>
+\subsection{Goodnight!}
```
-In LaTeX:
+In HTML:
```
-\subsection{Goodnight!}
+<h3 id="goodnight!">Goodnight!</h3>
```
In gemtext:
## Links
-Still to be implemented
+Links look similar to images in markdown, but use `=` instead of an exclamation mark. A link must start at the beginning of a line. Links take more fields than images, in one of the following forms:
+
+```
+=[Caption][LaTeX link][HTML link][gemtext link]
+
+or...
-## Email addresses
+=[LaTeX link][HTML link][gemtext link]
-Still to be implemented
+or...
+
+=[Caption][Universal link]
+
+or...
+
+=[Universal link]
+```
+
+This is because different formats might require different link locations. Format-agnostic links are still possible for ease of writing, or if the file is specific to one format.
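
The four link forms above can be told apart by field count alone. The following is a minimal sketch of that idea using only the standard library; `Format`, `link_fields` and `resolve` are hypothetical names chosen for illustration and are not the parser's actual API.

```rust
/// Output formats a link can target (hypothetical enum for this sketch).
#[derive(Clone, Copy)]
enum Format {
    Latex,
    Html,
    Gemtext,
}

/// Splits a line such as `=[Caption][a][b][c]` into its bracketed fields.
/// Returns `None` when the line is not a link or has more than four fields.
fn link_fields(line: &str) -> Option<Vec<&str>> {
    let inner = line.strip_prefix("=[")?.strip_suffix(']')?;
    let fields: Vec<&str> = inner.split("][").collect();
    if fields.len() <= 4 {
        Some(fields)
    } else {
        None
    }
}

/// Picks the optional caption and the link target for one output format.
/// Assumes the field order documented above: caption (if any), then the
/// LaTeX, HTML and gemtext links, or a single universal link.
fn resolve(fields: &[&str], format: Format) -> Option<(Option<String>, String)> {
    let index = match format {
        Format::Latex => 0,
        Format::Html => 1,
        Format::Gemtext => 2,
    };
    match fields.len() {
        1 => Some((None, fields[0].to_string())),
        2 => Some((Some(fields[0].to_string()), fields[1].to_string())),
        3 => Some((None, fields[index].to_string())),
        4 => Some((Some(fields[0].to_string()), fields[index + 1].to_string())),
        _ => None,
    }
}
```

With this sketch, resolving `=[Docs][doc.pdf][doc.html][doc.gmi]` for `Format::Html` yields the caption `Docs` and the target `doc.html`, while `=[a.txt]` yields `a.txt` for every format.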
## Paragraphs
A paragraph is any line that does not start with one of the following: `#-.` or `\`\`\`` or one of the macros such as `[TOC]`
-## Images & other external files
+## Images
-Still to be implemented
+As in conventional markdown, images are represented with square brackets preceded by an exclamation mark. This must begin on the first character of a new line. The image caption is captured in the first set of square brackets, with the path to the image captured in the second. In LaTeX, this renders as an image centered in a figure environment, with the figure caption taken from the first pair of brackets. In HTML this renders as an image in a `<figure>` tag, with a `<figcaption>` containing the caption. In gemtext this is simply a link to the path provided, with the caption as the link text.
+
+Example:
+
+```
+![Goodnight][images/goodnight.png]
+```
+
+In LaTeX:
+
+```
+\begin{figure}
+ \centering
+ \includegraphics[width=0.7\textwidth]{images/goodnight.png}
+ \caption{Goodnight}
+\end{figure}
+```
+
+In HTML:
+
+```
+<figure>
+ <img src='images/goodnight.png' />
+ <figcaption>Goodnight</figcaption>
+</figure>
+```
+
+In gemtext:
+
+```
+=> <images/goodnight.png> Goodnight (image)
+```
## Code
Inline code blocks can be used with the `\`` backtick character. In LaTeX this becomes `\\texttt{...}`, in HTML this becomes `<code>...</code>` and in gemtext it is included as written. A bar character cannot currently be used in an inline code block.
-Code blocks can be used with 3 backtick characters: `\`\`\``. This sequence must start on the first character on a new line.
+Code blocks can be used with three backtick characters: `\`\`\``. This sequence must start on the first character of a new line. This is parsed into the `<pre>` HTML tag, the `verbatim` LaTeX environment and the gemtext code block environment (also bookended by three backticks).
## Block quotes
Block quotes are started with the tab character `\t`. They can contain any other elements.
- Here is an example of a blockquote.
+ This is a paragraph within a blockquote
- - Here is a list element in a blockquote
+ - This is a list element in a blockquote
Tabbed lines with a space between them are joined into one single blockquote. If two different blockquotes are needed next to each other, a backslash at the first character of a newline can split the two up.
+This is parsed into the HTML `quote` tag, the LaTeX `quote` environment and the gemtext block quote environment (where every line is preceded by a `>` symbol).
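
The gemtext case can be illustrated with a short sketch. It assumes the contents of the block quote have already been rendered to gemtext; `quote_gemtext` is a hypothetical helper, not a function from the parser.

```rust
/// Prefixes every line of already-rendered gemtext with `>`.
/// Empty lines become a bare `>` so the quote stays one contiguous block.
fn quote_gemtext(rendered: &str) -> String {
    rendered
        .lines()
        .map(|line| {
            if line.is_empty() {
                ">".to_string()
            } else {
                format!("> {line}")
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Emitting a bare `>` for blank lines is a design choice of this sketch; the parser may handle blank lines differently.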
+
## Inline formatting
-- Bold styling will work only with a single asterisk character.
-- Italic styling will work only with a single underscore character.
+Bold styling is done with a single `\*` character on each side of the bold text. This is parsed to `\\textbf` in LaTeX and `<b>` in HTML. Gemtext does not support inline formatting.
+
+Italic styling is done with a single `\_` character on each side of the italicised text. This is parsed to `\\textit` in LaTeX and `<em>` in HTML. Gemtext does not support inline formatting.
## Lists
-- Unordered lists will work only with the hyphen character.
-- Ordered lists will work only with the period character.
+In HTML and LaTeX, lists are parsed directly to list markup with no complications (`enumerate`, `itemize` and `\\item` for LaTeX and `<ol>`, `<ul>` and `<li>` for HTML). In gemtext, both types of list item are prefixed with a `\*` character and no nesting is needed.
+
+Ordered lists will work only with the period character. They are numbered by the browser in the case of HTML, or externally by the LaTeX compiler in the case of LaTeX. Gemtext does not support ordered lists so the parser numbers the list elements in plaintext. Ordered lists can be broken up by including a blank line between two lists.
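
As a rough illustration of plaintext numbering for gemtext, the sketch below prefixes each item with `*` and a running number. The exact output layout (`* 1. item`) is an assumption made for this example, and `ordered_list_gemtext` is a hypothetical name rather than the parser's own function.

```rust
/// Renders an ordered list for gemtext by writing the numbers as plain text.
/// The `* N. item` layout is assumed for illustration only.
fn ordered_list_gemtext(items: &[&str]) -> String {
    items
        .iter()
        .enumerate()
        .map(|(i, item)| format!("* {}. {}\n", i + 1, item))
        .collect()
}
```

Because the numbers are written by the renderer, a blank line that splits one list into two would restart the count when each list is rendered separately.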
## Macros
-To generate the title, table of contents and bibliography, the following macros can be used:
+To generate the title, table of contents and bibliography, the following macros can be used at the first character of a newline:
-- `[TITLE]` generates the title block
+- `[META]` generates the title block
- `[TOC]` generates the table of contents
-- `[META]` generates the bibliography
+- `[REFS]` generates the bibliography
+
+In LaTeX, the title and bibliography are styled by the parser, rather than the native `\\maketitle` or `\\printbibliography` command. The table of contents is generated by the LaTeX compiler natively. In HTML and gemtext, the title, table of contents and references are all styled by the parser.
+
+Note that in HTML, `<hgroup>` is used for a semantic title block and the `<h1>` tag is used for the title only. In gemtext, the first heading level is likewise used for the title only.
## Metadata
. `issuetitle`: Issue title of the book or journal the work is in (optional, must be used with `journaltitle`)
. `editor`: Author of the book or journal the work is in. Should be formatted the same as "author". (optional, must be used with `journaltitle`)
. `volume`: Volume of the book or journal the work is in. Example: "7th". (optional, must be used with `journaltitle`)
+
+## Maths
+
+Math mode has yet to be implemented.
// so that we always get the right one moving left to right
let mut front_back = [back, front].into_iter().cycle();
for (index, _) in string.rmatch_indices(symbol) {
- if index == 0 || !(&text[index - 1 .. index] == "\\") {
+ if match &text.get(index.saturating_sub(1) .. index) {
+ Some(r"\") => {
+ match &text.get(index.saturating_sub(2) .. index.saturating_sub(1)) {
+ Some(r"\") => true,
+ _ => false
+ }
+ },
+ _ => true
+ } {
text.replace_range(index .. index + 1, front_back.next().unwrap());
}
}
let re = Regex::new(r"\\(?<escaped>[\d\D])").unwrap();
for capture in re.captures_iter(&string).collect::<Vec<_>>().iter().rev() {
let mat = capture.name("escaped").unwrap();
- if match mat.start() > 1 {
- true => {&string[mat.start()-2..mat.start()-1] != r"\"}
- false => {true}
- } {
- new_string.replace_range(mat.start()-1..mat.end(), mat.as_str());
- }
+ new_string.replace_range(mat.start()-1..mat.end(), mat.as_str());
}
new_string
}
pub fn parse_inline_latex (string: &str) -> String {
- let mut text = string.replace(r"\\", r"\\\");
+ let mut text = string.replace(r"\\", r"\\\\");
text = parse_inline(&text, '*', r"\\textbf{", "}");
text = parse_inline(&text, '_', r"\\textit{", "}");
text = parse_inline(&text, '`', r"\\texttt{", "}");
}
pub fn parse_inline_html (string: &str) -> String {
- let mut text = string
+ let mut text = string.to_string();
+
+ text = text
.replace(r"&", r"&amp;")
.replace(r"<", r"&lt;")
.replace(r">", r"&gt;");
use super::*;
#[test]
- fn test_formatting_parser() {
- let string = parse_inline("`G_ood_n*igh*t`", '*', "B", "B");
- let string = parse_inline(&string, '_', "I", "I");
- let string = parse_inline(&string, '`', "C", "C");
- assert_eq!(string, "CGIoodInBighBtC");
+ fn test_parse_inline() {
+ assert_eq!(r"`oo`_igh_\*\_\`", parse_inline(r"`oo`_i**_\*\_\`", '*', "g", "h"));
+ assert_eq!(r"`oo`night\*\_\`", parse_inline(r"`oo`_igh_\*\_\`", '_', "n", "t"));
+ assert_eq!(r"Goodnight\*\_\`", parse_inline(r"`oo`night\*\_\`", '`', "G", "d"));
+ }
+
+ #[test]
+ fn test_de_escape() {
+ assert_eq!(r"\\abc ", de_escape(r"\\\\a\b\c\ "));
+ }
+
+ #[test]
+ fn test_parse_inline_remove() {
+ assert_eq!(r"Goodnight\*\_\`", parse_inline_remove(r"*__``Goodnight\*\_\`*"));
+ }
+
+ #[test]
+ fn test_parse_inline_latex() {
+ assert_eq!(r"\textbf{Goodnight}", parse_inline_latex(r"*Goodnight*"));
+ assert_eq!(r"\textbf{\\goodnight{}}", parse_inline_latex(r"*\\goodnight{}*"));
+ assert_eq!(r"\\\textit{Goodnight \&{} Goodnight}", parse_inline_latex(r"\\_Goodnight & Goodnight_"));
+ }
+
+ #[test]
+ fn test_parse_inline_html() {
+ assert_eq!(r"<b>Goodnight</b>", parse_inline_html(r"*Goodnight*"));
+ assert_eq!(r"<b>\Goodnight</b>", parse_inline_html(r"*\\Goodnight*"));
+ assert_eq!(r"&lt;<em>Goodnight &lt;/br&gt; Goodnight</em>", parse_inline_html(r"\<_Goodnight </br> Goodnight_"));
+ }
+
+ #[test]
+ fn test_parse_inline_gemtext() {
+ assert_eq!(r"\Goodnight", parse_inline_gemtext(r"*\\Goodnight*"));
}
}
use crate::types::preformatted::Preformatted;
use crate::types::blockquote::BlockQuote;
use crate::types::contents::Contents;
+use crate::types::image::Image;
+use crate::types::link::Link;
+use crate::types::link::LinkType;
use crate::types::list::{Item, List};
// this gets the full text of the file from the document class (with the reference list and metadata removed)
let metadata = metadata;
let reference_list = reference_list;
+ let mut figure_counter = 1;
+
let mut lines = string.lines().peekable();
while let Some(line) = lines.next() {
if !(line == "") {
} {}
document.push(
Box::new(BlockQuote{elements: parse_body(text, metadata.clone(), reference_list.clone())}))
- }
+ },
+
+ // match images
+ ('!', '[', ..) => {
+ let re = Regex::new(r"!\[(?<caption>[\d\w (\\\[)]*)\]\[(?<path>[\d\w\.\/]*)\]").unwrap();
+ match re.captures_at(line, 0) {
+ Some(captures) => {
+ let caption = captures.name("caption").unwrap().as_str().to_string();
+ let path = captures.name("path").unwrap().as_str().to_string();
+ document.push(
+ Box::new(Image{caption, path, count: figure_counter})
+ );
+ figure_counter += 1;
+ },
+ None => ()
+ }
+ },
+
+ // match links
+ ('=', '[', ..) => {
+ let re = Regex::new(r"=(\[[\d\w \./]*\]){1,4}").unwrap();
+ match re.find_at(&line, 0) {
+ Some(mat) => {
+ let feilds: Vec<&str> = line[mat.start()+2..mat.end()-1].split("][").collect();
+ document.push(Box::new(Link{link_type: match feilds.len() { // this could be neater
+ 1 => {LinkType::Universal(feilds[0].to_string())},
+ 2 => {LinkType::UniversalCaption(
+ feilds[0].to_string(),
+ feilds[1].to_string()
+ )},
+ 3 => {LinkType::TypedUniversal(
+ feilds[0].to_string(),
+ feilds[1].to_string(),
+ feilds[2].to_string()
+ )},
+ _ => {LinkType::TypedCaptioned(
+ feilds[0].to_string(),
+ feilds[1].to_string(),
+ feilds[2].to_string(),
+ feilds[3].to_string()
+ )}
+ }}));
+ },
+ None => ()
+ }
+ },
// make everything else a paragraph
_ => {
pub mod preformatted;
pub mod blockquote;
pub mod contents;
+pub mod image;
+pub mod link;
pub trait Renderable {
fn render_latex(&self) -> String;
4 => "h5",
_ => "h6"
};
- format!("<{section} id={anchor}>{text}</{section}>\n")
+ format!("<{section} id='{anchor}'>{text}</{section}>\n")
}
fn render_gemtext (&self) -> String {
#[test]
fn test_latex_parser() {
let heading = Heading{ text: String::from("Goodnight"), level: 3};
- assert_eq!(&heading.render_latex(), "\\subsubsection{Goodnight}");
+ assert_eq!(&heading.render_latex(), "\\subsubsection{Goodnight}\n\n");
}
#[test]
fn test_html_parser() {
- let heading = Heading{ text: String::from("Goodnight"), level: 5};
- assert_eq!(&heading.render_html(), "<h5>Goodnight</h5>\n");
+ let heading = Heading{ text: String::from("Goodnight Publishing"), level: 5};
+ assert_eq!(&heading.render_html(), "<h6 id='goodnight-publishing'>Goodnight Publishing</h6>\n");
}
#[test]
fn test_gemtext_parser() {
let heading = Heading{ text: String::from("Goodnight"), level: 2};
- assert_eq!(&heading.render_gemtext(), "## Goodnight");
+ assert_eq!(&heading.render_gemtext(), "### Goodnight\n\n");
}
}
--- /dev/null
+use crate::types::Renderable;
+
+pub struct Image {
+ pub path: String,
+ pub caption: String,
+ pub count: usize
+}
+
+impl Renderable for Image {
+ fn render_latex(&self) -> String {
+ format!("\
+\\begin{{figure}}
+ \\centering
+ \\includegraphics[width=0.7\\textwidth]{{{path}}}
+ \\caption{{{caption}}}
+\\end{{figure}}\n\n",
+ path = self.path,
+ caption = self.caption)
+ }
+ fn render_html(&self) -> String {
+ let anchor: String = self.caption.to_lowercase().split_whitespace().map(|s| s.chars().chain(['-'])).flatten().collect();
+ format!("\
+<figure>
+ <img src='{path}' />
+ <figcaption id='{anchor}'><em>Figure {count}.</em> {caption}</figcaption>
+</figure>\n\n",
+ path = self.path,
+ caption = self.caption,
+ count = self.count,
+ anchor = &anchor[..anchor.len()-1])
+ }
+ fn render_gemtext(&self) -> String {
+ format!("\
+=> <{path}> Figure {count} (image). {caption}
+\n",
+ path = self.path,
+ caption = self.caption,
+ count = self.count)
+ }
+ fn render_plaintext(&self) -> String {
+ format!("Figure {count} (image). {caption}",
+ caption = self.caption,
+ count = self.count)
+ }
+}
--- /dev/null
+use super::Renderable;
+
+pub enum LinkType {
+ Universal(String),
+ UniversalCaption(String, String),
+ TypedUniversal(String, String, String),
+ TypedCaptioned(String, String, String, String)
+}
+
+pub struct Link {
+ pub link_type: LinkType
+}
+
+impl Renderable for Link {
+ fn render_latex(&self) -> String {
+ match &self.link_type {
+ LinkType::Universal(link) => {
+ format!("\\url{{{link}}}\n\n")
+ },
+ LinkType::UniversalCaption(caption, link) => {
+ format!("\\href{{{link}}}{{{caption}}}\n\n")
+ },
+ LinkType::TypedUniversal(link, _, _) => {
+ format!("\\url{{{link}}}\n\n")
+ },
+ LinkType::TypedCaptioned(caption, link,_, _) => {
+ format!("\\href{{{link}}}{{{caption}}}\n\n")
+ },
+ }
+ }
+ fn render_html(&self) -> String {
+ match &self.link_type {
+ LinkType::Universal(link) => {
+ format!("<a href='{link}'>{link}</a>\n")
+ },
+ LinkType::UniversalCaption(caption, link) => {
+ format!("<a href='{link}'>{caption}</a>\n")
+ },
+ LinkType::TypedUniversal(_, link, _) => {
+ format!("<a href='{link}'>{link}</a>\n")
+ },
+ LinkType::TypedCaptioned(caption, _, link, _) => {
+ format!("<a href='{link}'>{caption}</a>\n")
+ },
+ }
+ }
+ fn render_gemtext(&self) -> String {
+ match &self.link_type {
+ LinkType::Universal(link) => {
+ format!("=> {link}\n\n")
+ },
+ LinkType::UniversalCaption(caption, link) => {
+ format!("=> {link} {caption}\n\n")
+ },
+ LinkType::TypedUniversal(_, _, link) => {
+ format!("=> {link}\n\n")
+ },
+ LinkType::TypedCaptioned(caption, _, _, link) => {
+ format!("=> {link} {caption}\n\n")
+ },
+ }
+ }
+ fn render_plaintext(&self) -> String {
+ match &self.link_type {
+ LinkType::Universal(link) => {
+ format!("\\url{{{link}}}\n\n")
+ },
+ LinkType::UniversalCaption(caption, link) => {
+ format!("\\href{{{caption}}}{{{link}}}\n\n")
+ },
+ LinkType::TypedUniversal(link, _, _) => {
+ format!("\\url{{{link}}}\n\n")
+ },
+ LinkType::TypedCaptioned(caption, link,_, _) => {
+ format!("\\href{{{caption}}}{{{link}}}\n\n")
+ },
+ }
+ }
+}
#[test]
fn test_latex_parser() {
let paragraph = Paragraph{ text: String::from("*Goodnight*")};
- assert_eq!(&paragraph.render_latex(), "\\textbf{Goodnight}");
+ assert_eq!(&paragraph.render_latex(), "\\textbf{Goodnight}\n\n");
}
#[test]
fn test_html_parser() {
#[test]
fn test_gemtext_parser() {
let paragraph = Paragraph{ text: String::from("`Goodnight`")};
- assert_eq!(&paragraph.render_gemtext(), "Goodnight");
+ assert_eq!(&paragraph.render_gemtext(), "Goodnight\n\n");
}
}