Safe Haskell | Safe-Inferred |
---|---|
Language | Haskell2010 |
This module deals with punctuation in Korean text.
Synopsis
- data ArrowTransformationOption
- transformArrow :: Set ArrowTransformationOption -> [HtmlEntity] -> [HtmlEntity]
- data CitationQuotes = CitationQuotes {
- title :: (Text, Text)
- subtitle :: (Text, Text)
- htmlElement :: Maybe (HtmlTag, HtmlRawAttrs)
- data Quotes = Quotes {}
- data QuotePair
- angleQuotes :: CitationQuotes
- cornerBrackets :: CitationQuotes
- curvedQuotes :: Quotes
- curvedSingleQuotesWithQ :: Quotes
- guillemets :: Quotes
- horizontalCornerBrackets :: Quotes
- horizontalCornerBracketsWithQ :: Quotes
- quoteCitation :: CitationQuotes -> [HtmlEntity] -> [HtmlEntity]
- transformQuote :: Quotes -> [HtmlEntity] -> [HtmlEntity]
- verticalCornerBrackets :: Quotes
- verticalCornerBracketsWithQ :: Quotes
- data Stops = Stops {}
- horizontalStops :: Stops
- horizontalStopsWithSlashes :: Stops
- normalizeStops :: Stops -> [HtmlEntity] -> [HtmlEntity]
- transformEllipsis :: [HtmlEntity] -> [HtmlEntity]
- verticalStops :: Stops
- transformEmDash :: [HtmlEntity] -> [HtmlEntity]
Arrows
data ArrowTransformationOption Source #
Substitution options for the `transformArrow` function. These options can be combined as elements of a set:
- `[]`: Transform only leftwards and rightwards arrows.
- `[LeftRight]`: Transform bi-directional arrows as well as left/rightwards arrows.
- `[DoubleArrow]`: Transform double arrows as well as single arrows.
- `[LeftRight, DoubleArrow]`: Transform all types of arrows.
LeftRight | A bidirectional arrow (e.g., ↔︎). |
DoubleArrow | An arrow which has two lines (e.g., ⇒). |
Instances
transformArrow :: Set ArrowTransformationOption -> [HtmlEntity] -> [HtmlEntity] Source #
Transforms hyphens and less-than and greater-than inequality symbols that mimic arrows into actual arrow characters:
- `->` turns into `→` (U+2192 RIGHTWARDS ARROW).
- `<-` turns into `←` (U+2190 LEFTWARDS ARROW).
- `<->` turns into `↔` (U+2194 LEFT RIGHT ARROW) if `LeftRight` is configured.
- `=>` turns into `⇒` (U+21D2 RIGHTWARDS DOUBLE ARROW) if `DoubleArrow` is configured.
- `<=` turns into `⇐` (U+21D0 LEFTWARDS DOUBLE ARROW) if `DoubleArrow` is configured.
- `<=>` turns into `⇔` (U+21D4 LEFT RIGHT DOUBLE ARROW) if both `DoubleArrow` and `LeftRight` are configured at the same time.
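The substitution rules above can be sketched on a plain `String`. This is only an illustration, not the library's implementation: `ArrowOption`, `rules`, and `replaceArrows` are hypothetical stand-ins, the real function works on `[HtmlEntity]` rather than strings, and leaving a disabled pattern untouched is a choice made for this sketch.

```haskell
import Data.List (isPrefixOf)
import Data.Set (Set)
import qualified Data.Set as Set

-- Hypothetical stand-in mirroring the documented option type.
data ArrowOption = LeftRight | DoubleArrow
  deriving (Eq, Ord, Show)

-- (ASCII pattern, replacement, options that must all be enabled).
-- Longer patterns come first so "<->" is not consumed as "<-".
rules :: [(String, String, [ArrowOption])]
rules =
  [ ("<=>", "⇔", [DoubleArrow, LeftRight])
  , ("<->", "↔", [LeftRight])
  , ("=>", "⇒", [DoubleArrow])
  , ("<=", "⇐", [DoubleArrow])
  , ("->", "→", [])
  , ("<-", "←", [])
  ]

-- Replace enabled patterns; copy disabled patterns through unchanged
-- (an assumption of this sketch, not documented library behaviour).
replaceArrows :: Set ArrowOption -> String -> String
replaceArrows _ [] = []
replaceArrows opts s@(c : rest) =
  case [rule | rule@(pat, _, _) <- rules, pat `isPrefixOf` s] of
    (pat, rep, required) : _
      | all (`Set.member` opts) required -> rep ++ continue pat
      | otherwise -> pat ++ continue pat
    [] -> c : replaceArrows opts rest
  where
    continue pat = replaceArrows opts (drop (length pat) s)

main :: IO ()
main = do
  putStrLn (replaceArrows Set.empty "A -> B <- C")
  putStrLn (replaceArrows (Set.fromList [LeftRight, DoubleArrow]) "A <-> B <=> C")
```

Ordering the rules from longest to shortest keeps the scan a single left-to-right pass with no backtracking.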
Quotes
data CitationQuotes Source #
A set of quoting parentheses to be used by the `quoteCitation` function.
There are two presets: `angleQuotes` and `cornerBrackets`. Both surround titles with a `<cite>` tag. To disable the surrounding element, set the `htmlElement` field to `Nothing`, e.g.:
`angleQuotes { htmlElement = Nothing }`
CitationQuotes | |
|
Instances
Show CitationQuotes Source # | |
Defined in Text.Seonbi.Punctuation showsPrec :: Int -> CitationQuotes -> ShowS # show :: CitationQuotes -> String # showList :: [CitationQuotes] -> ShowS # | |
Eq CitationQuotes Source # | |
Defined in Text.Seonbi.Punctuation (==) :: CitationQuotes -> CitationQuotes -> Bool # (/=) :: CitationQuotes -> CitationQuotes -> Bool # |
data Quotes Source #
Pairs of substitutes for folk single and double quotes. Used by the `transformQuote` function.
There are three presets: `curvedQuotes`, `guillemets`, and `curvedSingleQuotesWithQ`:
- `curvedQuotes` uses South Korean curved quotation marks, which follow English quotes (`‘`: U+2018, `’`: U+2019, `“`: U+201C, `”`: U+201D).
- `guillemets` uses North Korean angular quotation marks, influenced by Russian guillemets but adjusted to use East Asian angular quotes in their place (`〈`: U+3008, `〉`: U+3009, `《`: U+300A, `》`: U+300B).
- `curvedSingleQuotesWithQ` is almost the same as `curvedQuotes` but wraps text with a `<q>` tag instead of curved double quotes.
data QuotePair Source #
A pair of an opening quote and a closing quote.
QuotePair Text Text | Wrap the quoted text with a pair of punctuation characters. |
HtmlElement HtmlTag HtmlRawAttrs | Wrap the quoted text (HTML elements) with an element like |
angleQuotes :: CitationQuotes Source #
Cite a title using angle quotes, used by South Korean orthography in horizontal writing (橫書), e.g., 《나비와 엉겅퀴》 or 〈枾崎의 바다〉.
cornerBrackets :: CitationQuotes Source #
Cite a title using corner brackets, used by South Korean orthography in vertical writing (縱書) and Japanese orthography, e.g., 『나비와 엉겅퀴』 or 「枾崎의 바다」.
curvedQuotes :: Quotes Source #
English-style curved quotes (`‘`: U+2018, `’`: U+2019, `“`: U+201C, `”`: U+201D), which are used by South Korean orthography.
curvedSingleQuotesWithQ :: Quotes Source #
Use English-style curved quotes (`‘`: U+2018, `’`: U+2019) for single quotes, and HTML `<q>` tags for double quotes.
guillemets :: Quotes Source #
East Asian guillemets (`〈`: U+3008, `〉`: U+3009, `《`: U+300A, `》`: U+300B), which are used by North Korean orthography.
horizontalCornerBrackets :: Quotes Source #
Traditional horizontal corner brackets (`「`: U+300C, `」`: U+300D, `『`: U+300E, `』`: U+300F), which are used by East Asian orthography.
horizontalCornerBracketsWithQ :: Quotes Source #
Use horizontal corner brackets (`「`: U+300C, `」`: U+300D) for single quotes, and HTML `<q>` tags for double quotes.
quoteCitation Source #
:: CitationQuotes | Quoting parentheses to wrap titles. |
-> [HtmlEntity] | The input HTML entities to transform. |
-> [HtmlEntity] |
People tend to cite the title of a work (e.g., a book, a paper, a poem, a song, a film, a TV show, a game) by wrapping it in inequality symbols, like `<<나비와 엉겅퀴>>` or `<枾崎의 바다>`, instead of proper angle quotes, like `《나비와 엉겅퀴》` or `〈枾崎의 바다〉`.
This transforms, in the given HTML fragments, all folk-citing quotes into typographic citing quotes:
- Pairs of less-than and greater-than inequality symbols (`<` & `>`) into pairs of proper angle quotes (`〈` & `〉`)
- Pairs of two consecutive inequality symbols (`<<` & `>>`) into pairs of proper double angle quotes (`《` & `》`)
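The two rules above can be sketched on a plain `String`. This is a simplified illustration under stated assumptions: `citeTitles` and `splitAtFirst` are hypothetical names, the input is taken to be bare text with no real HTML tags in it, and the `<cite>` wrapping that the presets add is left out.

```haskell
import Data.List (isPrefixOf, stripPrefix)

-- Split at the first occurrence of a (non-empty) needle.
-- Returns the text before it and the text after it, or Nothing if absent.
splitAtFirst :: String -> String -> Maybe (String, String)
splitAtFirst needle = go []
  where
    go _ [] = Nothing
    go acc s@(c : cs)
      | needle `isPrefixOf` s = Just (reverse acc, drop (length needle) s)
      | otherwise = go (c : acc) cs

-- Turn <<title>> into 《title》 and <title> into 〈title〉.
-- An opening symbol with no matching close is copied through unchanged.
citeTitles :: String -> String
citeTitles [] = []
citeTitles s@(c : rest)
  | Just body <- stripPrefix "<<" s
  , Just (title, after) <- splitAtFirst ">>" body =
      "《" ++ title ++ "》" ++ citeTitles after
  | Just body <- stripPrefix "<" s
  , Just (title, after) <- splitAtFirst ">" body =
      "〈" ++ title ++ "〉" ++ citeTitles after
  | otherwise = c : citeTitles rest

main :: IO ()
main = do
  putStrLn (citeTitles "<<나비와 엉겅퀴>>")
  putStrLn (citeTitles "<枾崎의 바다>")
```

Checking the double form before the single form matters: otherwise `<<` would be read as a single `<` followed by a title starting with `<`.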
transformQuote Source #
:: Quotes | Pair of quoting punctuations and wrapping element. |
-> [HtmlEntity] | The input HTML entities to transform. |
-> [HtmlEntity] |
Transforms pairs of apostrophes (`'`: U+0027) and straight double quotes (`"`: U+0022) into more appropriate quotation marks, such as typographic single quotes (`‘`: U+2018, `’`: U+2019) and double quotes (`“`: U+201C, `”`: U+201D), or wraps the quoted text with an HTML element such as a `<q>` tag. Which one applies depends on the options passed as the first parameter; see also `Quotes`.
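As a rough illustration of the idea, the sketch below alternates opening and closing marks on a plain `String`. It is not the library's algorithm: `QuoteStyle`, `curved`, `angular`, and `requote` are hypothetical names, the `<q>` element variant is omitted, and a simple toggle like this also rewrites unpaired quotes and apostrophes inside words, which a pair-aware implementation would avoid.

```haskell
-- Hypothetical stand-in for a quote preset: (opening, closing) marks.
data QuoteStyle = QuoteStyle
  { singleMarks :: (String, String)
  , doubleMarks :: (String, String)
  }

-- Marks taken from the presets described in this module's documentation.
curved, angular :: QuoteStyle
curved = QuoteStyle ("‘", "’") ("“", "”")
angular = QuoteStyle ("〈", "〉") ("《", "》")

-- Replace each straight quote with an opening or closing mark,
-- tracking separately whether a single or double quote is open.
requote :: QuoteStyle -> String -> String
requote style = go False False
  where
    go _ _ [] = []
    go inSingle inDouble (c : cs)
      | c == '\'' = pick inSingle (singleMarks style) ++ go (not inSingle) inDouble cs
      | c == '"' = pick inDouble (doubleMarks style) ++ go inSingle (not inDouble) cs
      | otherwise = c : go inSingle inDouble cs
    pick isOpen (open, close) = if isOpen then close else open

main :: IO ()
main = do
  putStrLn (requote curved "\"안녕\" 그리고 '하나'")
  putStrLn (requote angular "\"인용\"")
```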
verticalCornerBrackets :: Quotes Source #
Vertical corner brackets (`﹁`: U+FE41, `﹂`: U+FE42, `﹃`: U+FE43, `﹄`: U+FE44), which are used by East Asian orthography.
verticalCornerBracketsWithQ :: Quotes Source #
Use vertical corner brackets (`﹁`: U+FE41, `﹂`: U+FE42) for single quotes, and HTML `<q>` tags for double quotes.
Stops: periods, commas, & interpuncts
data Stops Source #
A set of stops (`period`, `comma`, and `interpunct`) to be used by the `normalizeStops` function.
There are three presets: `horizontalStops`, `verticalStops`, and `horizontalStopsWithSlashes`.
horizontalStops :: Stops Source #
Stop sentences in the modern Korean style which follows Western stops. E.g.:
봄·여름·가을·겨울. 어제, 오늘.
horizontalStopsWithSlashes :: Stops Source #
Similar to `horizontalStops` except slashes are used instead of interpuncts. E.g.:
봄/여름/가을/겨울. 어제, 오늘.
normalizeStops :: Stops -> [HtmlEntity] -> [HtmlEntity] Source #
Normalizes sentence stops (periods, commas, and interpuncts).
transformEllipsis :: [HtmlEntity] -> [HtmlEntity] Source #
Until 2015, the National Institute of Korean Language (國立國語院) allowed only ellipses (`…`) to mark an omitted word, phrase, line, or paragraph, or speechlessness. However, people tend to use three or more consecutive periods (`...`) instead of a proper ellipsis. Although NIKL has started to allow consecutive periods besides an ellipsis for these uses, ellipses are still the proper punctuation in principle.
This transforms, in the given HTML fragments, every run of three consecutive periods into a proper ellipsis.
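A minimal sketch of this substitution on a plain `String` follows. The name `ellipsize` is hypothetical, and the real function operates on `[HtmlEntity]`. How the library treats runs longer than three periods is not stated here; this sketch simply replaces each group of three in turn.

```haskell
import Data.List (stripPrefix)

-- Replace every group of three consecutive periods with U+2026.
-- Leftover periods (one or two) are copied through unchanged.
ellipsize :: String -> String
ellipsize [] = []
ellipsize s@(c : rest) =
  case stripPrefix "..." s of
    Just after -> '…' : ellipsize after
    Nothing -> c : ellipsize rest

main :: IO ()
main = putStrLn (ellipsize "음... 글쎄요.")
```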
verticalStops :: Stops Source #
Stop sentences in the pre-modern Korean style which follows Chinese stops. E.g.:
봄·여름·가을·겨울。어제、오늘。
Dashes
transformEmDash :: [HtmlEntity] -> [HtmlEntity] Source #
Transforms the following folk em dashes into proper em dashes (`—`: U+2014 EM DASH):
- A hyphen (`-`: U+002D HYPHEN-MINUS) surrounded by spaces.
- Two or three consecutive hyphens (`--` or `---`).
- A hangul vowel `ㅡ` (U+3161 HANGUL LETTER EU) surrounded by spaces. Some Korean writers use the hangul vowel `ㅡ` ("eu") instead of an em dash, whether out of ignorance or negligence.
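The three cases above can be sketched on a plain `String` as follows. The name `emDashes` is hypothetical and the real function works on `[HtmlEntity]`. Whether the library keeps or removes the spaces around the dash is not stated here, so keeping them is an assumption of this sketch.

```haskell
import Data.List (stripPrefix)

-- Replace folk em dashes with U+2014 EM DASH.
-- Surrounding spaces are kept (an assumption of this sketch); the trailing
-- space is pushed back so that it can also start the next match.
emDashes :: String -> String
emDashes [] = []
emDashes s@(c : rest)
  | Just after <- stripPrefix "---" s = '—' : emDashes after
  | Just after <- stripPrefix "--" s = '—' : emDashes after
  | Just after <- stripPrefix " - " s = " —" ++ emDashes (' ' : after)
  | Just after <- stripPrefix " ㅡ " s = " —" ++ emDashes (' ' : after)
  | otherwise = c : emDashes rest

main :: IO ()
main = do
  putStrLn (emDashes "서울 - 부산")
  putStrLn (emDashes "서울--부산---대구")
  putStrLn (emDashes "서울 ㅡ 부산")
```

A hyphen inside a word, as in `well-known`, has no surrounding spaces and is left alone.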