If the outline is:
This
    is
        a sentence
            .}
The output that includes every intermediate line would be:
This
This is
This is a sentence
This is a sentence .}
The output that keeps only the complete lines, those ending in "}", would be:
This is a sentence .}
Below is an awk script that can produce either output, depending on which of the two print lines near the end is uncommented. Awk is available for Windows, Linux, and most other operating systems.
#!/usr/bin/gawk -f
# Copyright 2018 David A. Kra granting Creative Commons Share Alike license to all users. See https://creativecommons.org/licenses/by-sa/4.0/legalcode
# usage syntax #1: gawk -f deoutline.awk 'fullpathname' >deoutlined.txt
# usage syntax #2: ./deoutline.awk 'fullpathname' >deoutlined.txt
# Purpose: Converts a CMAPTools outline into a set of complete lines, where a "complete line" is one that ends with "}".
# Processes output produced by CMAPTools menu sequence: File | Export CMAP As... | CMAP Outline... .
# Input: outline: each line starts with n (0 <= n) groups of 4 blanks, then has a phrase. Each phrase is a concept or a relationship from the CMAP.
# Example:
# This
#     is
#         a sentence
#             }
# Output: Replace each empty field with the matching field from the line above, whatever its length. Only output lines ending with "}".
# Example:
# This is a sentence }
# Optional Postprocessing: Use grep -v ABC <tn.txt >tn+1.txt # to eliminate lines containing ABC
BEGIN { FS = " {4}"; prevline[1] = "" }
{
    # Replace each empty field with the phrase from the line above, whatever its length.
    for (i = 1; i < NF; i++) { if (length($i) < 1) $i = prevline[i] }
    # Remember this line's fields for the next line.
    # Does not handle an empty line with 0 tokens; leaves uninteresting content
    # in prevline left over from longer earlier lines.
    for (i = 1; i <= NF; i++) { prevline[i] = $i }
    # print    # prints every line
    if (substr($NF, length($NF)) == "}") print    # prints only lines that end in "}"
}
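To try the idea without a CMAPTools export at hand, here is a minimal sketch that feeds the four-line sample outline from the top of this post through an inline copy of the same fill-down logic. It is an illustration, not the script above: the variable names `sample`, `result`, and `prev` are made up for the demo, and FS is written as four literal blanks instead of " {4}" on the assumption that some awk versions do not support the {4} interval syntax.

```shell
#!/bin/sh
# Sample outline: each level is indented by one more group of 4 blanks.
sample=$(printf '%s\n' 'This' '    is' '        a sentence' '            .}')

# Inline fill-down sketch (hypothetical names; FS is four literal blanks).
result=$(printf '%s\n' "$sample" | awk '
BEGIN { FS = "    " }
{
    # Fill each empty leading field from the same field of the previous line.
    for (i = 1; i < NF; i++) if ($i == "") $i = prev[i]
    # Remember the fields of this line for the next one.
    for (i = 1; i <= NF; i++) prev[i] = $i
    # Keep only complete lines, that is, lines ending in "}".
    if (substr($NF, length($NF)) == "}") print
}')

printf '%s\n' "$result"
```

Run as is, this should print the single complete line `This is a sentence .}`. Replacing the last `if (...) print` with a bare `print` gives the cumulative four-line output instead.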