Substituting file contents in place with BASH and sed

Created: 05 Mar 2017 16:33:19, in Host development

The problem of substituting file contents in place on the command line is an old one. It tends to come up when dealing with configuration or other text files. You just need a quick substitution, perhaps with sed, but what you frequently end up with is either no substitution at all or a wrong one. Clearly, the problem deserves a little extra attention.

On the command line a small group of characters, in particular between double quotes, requires escaping with \ . For BASH, *, [, ], ?, ' and ", among others, have special meaning. Then there is sed. When used for substitution without the -r command line option, sed uses Basic Regular Expressions. This means $, ., *, [, \, ] and ^ have to be escaped with \ if they are to match literally. On top of that there is the small issue of the delimiter, / by default, which has to be escaped as well. Also, on the command line, it matters whether contents are placed in single or double quotes.
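A minimal sketch of the two layers described above. The variable names and sample strings are made up for illustration; the point is only that the shell deals with quotes first and sed interprets the regular expression afterwards.

```shell
#!/usr/bin/env bash
# Layer 1: the shell. Single quotes keep text literal, double quotes
# still expand variables.
name='world'
single='hello $name'
double="hello $name"
printf '%s\n' "$single"   # prints: hello $name
printf '%s\n' "$double"   # prints: hello world

# Layer 2: sed. In a Basic Regular Expression an unescaped dot matches
# any character, an escaped dot matches only a literal dot.
loose=$(printf 'a.c abc\n' | sed 's/a.c/X/g')
strict=$(printf 'a.c abc\n' | sed 's/a\.c/X/g')
printf '%s\n' "$loose"    # prints: X X
printf '%s\n' "$strict"   # prints: X abc
```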

Considering all the above, the small problem suddenly looks rather larger than it did at first.

It is not my goal to go into the intricacies of BASH or sed any further here. I just need a reliable piece of code that lets me make small changes to a file's contents in place, on the command line, quickly and efficiently. Below are two functions I came up with for exactly this purpose some time ago.

Substituting in place with sed

The first function, sed_sub_in_place, uses sed for substitution. It escapes sed special characters in the search string. In the replace string, / and & are escaped automatically.


# Usage: sed_sub_in_place search replace file
# ^ [ \ ] . * $ and / are escaped in search (Basic Regular Expressions)
# / and & are escaped in replace; a literal \ in replace has to be escaped by the caller
# keeps a copy of the original under the name file.tmp (--in-place is a GNU sed option)

sed_sub_in_place() {
  local s=${1}
  s=${s//\\/\\\\}
  s=${s//\//\\\/}
  s=${s//\^/\\\^}
  s=${s//\[/\\\[}
  s=${s//\]/\\\]}
  s=${s//\./\\\.}
  s=${s//\*/\\\*}
  s=${s//\$/\\\$}
  local s_w=${2//\//\\\/}
  s_w=${s_w//\&/\\\&}
  local f="$3"
  local s_e="s/$s/$s_w/g"
  sed --in-place=.tmp "$s_e" "$f"
}

Here are some examples of use:


# replacing < with &lt; in data.txt
sed_sub_in_place '<' '&lt;' data.txt

# replacing > with &gt; in data.txt
sed_sub_in_place '>' '&gt;' data.txt

# replacing " with &quot; in data.txt
sed_sub_in_place '"' '&quot;' data.txt 

# replacing ' with &apos; in data.txt
sed_sub_in_place "'" '&apos;' data.txt

# replacing '[div]' with '<div>' in data.txt
sed_sub_in_place '[div]' '<div>' data.txt

# replacing '[/div]' with '</div>' in data.txt
sed_sub_in_place '[/div]' '</div>' data.txt

# replacing SetEnvIfNoCase User-Agent (&lt;|<|%3C)(%20|s)*script(s|%20|>|&gt;|%3E) keep_out with REPLACED in data.txt

sed_sub_in_place 'SetEnvIfNoCase User-Agent (&lt;|<|%3C)(%20|s)*script(s|%20|>|&gt;|%3E) keep_out' 'REPLACED' data.txt
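
To see what ends up being handed to sed, the escaping steps can be wrapped in a helper that only prints the resulting expression. The name sed_sub_expr is hypothetical and not part of the original code; this is a sketch that mirrors the escaping in sed_sub_in_place, not a replacement for it.

```shell
#!/usr/bin/env bash
# sed_sub_expr (hypothetical helper): prints the s/// expression that
# results from escaping search and replace the way sed_sub_in_place does.
sed_sub_expr() {
  local s=$1 r=$2
  s=${s//\\/\\\\}   # backslash first, so later escapes are not doubled
  s=${s//\//\\\/}
  s=${s//\^/\\\^}
  s=${s//\[/\\\[}
  s=${s//\]/\\\]}
  s=${s//\./\\\.}
  s=${s//\*/\\\*}
  s=${s//\$/\\\$}
  r=${r//\//\\\/}
  r=${r//\&/\\\&}
  printf 's/%s/%s/g\n' "$s" "$r"
}

sed_sub_expr '[/div]' '</div>'   # prints: s/\[\/div\]/<\/div>/g
sed_sub_expr '<' '&lt;'          # prints: s/</\&lt;/g

# The expression can be tried on a stream before touching any file.
printf '%s\n' 'a [/div] b' | sed "$(sed_sub_expr '[/div]' '</div>')"
```

Printing the expression first is a cheap way to check that a tricky search string was escaped as intended.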

Substituting with BASH

The function below, bash_sub_in_place, escapes \, *, [, ] and ? in search. Escaping in replace is not necessary. The function reads the whole file in one go, so it is not suitable for making substitutions in very large files.


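# Usage: bash_sub_in_place search replace file
# \ * [ ] ? are escaped in search, due to their special meaning for BASH
# nothing is escaped in replace
# keeps a copy of the original under the name file.timestamp.nanoseconds.tmp

bash_sub_in_place() {
  local s=${1//\\/\\\\}
  s=${s//\*/\\\*}
  s=${s//\[/\\\[}
  s=${s//\]/\\\]}
  s=${s//\?/\\\?}
  local s_w="$2"
  local f="$3"
  local d=$(date +"%s.%N")
  cp "$f" "$f.$d.tmp"
  # file name quoted, so names with spaces work
  local f_c=$(< "$f")
  # replace quoted, so that & stays literal in BASH 5.2 and later
  local fcr=${f_c//${s}/"${s_w}"}
  # printf instead of echo, so contents starting with - are not taken as options
  printf '%s\n' "$fcr" > "$f"
}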

This function works the same way as sed_sub_in_place. Swap the function name in the examples above and run them to see for yourself.
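
The reason for escaping in search is that BASH treats the pattern in ${var//pattern/replacement} as a glob. A small sketch with made-up sample text shows the difference:

```shell
#!/usr/bin/env bash
text='a*b and axxb'

# Unescaped: * is a wildcard, so the pattern matches from the first a
# to the last b and the whole string is replaced.
glob=${text//a*b/X}

# Escaped, as bash_sub_in_place does it: only the literal a*b matches.
pat='a\*b'
literal=${text//$pat/X}

printf '%s\n' "$glob"      # prints: X
printf '%s\n' "$literal"   # prints: X and axxb
```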

More complex substitutions

Both functions can be used in a more sophisticated manner. One can, for example, use them to replace special characters in HTML with their predefined entities and back.

Here are examples:


# sed_htmlize takes one argument, file to make substitutions in
# replaces &,<,>,",' with HTML predefined entities
# & goes first, so that the & of the other entities is not escaped again
sed_htmlize(){
  sed_sub_in_place '&' '&amp;' "$1"
  sed_sub_in_place '<' '&lt;' "$1"
  sed_sub_in_place '>' '&gt;' "$1"
  sed_sub_in_place '"' '&quot;' "$1"
  sed_sub_in_place "'" '&apos;' "$1"
}

# sed_unhtmlize takes one argument, file to make substitutions in
# replaces HTML predefined entities with &,<,>,",'
# &amp; goes last, so that &amp;lt; ends up as &lt; and not as <
sed_unhtmlize(){
  sed_sub_in_place '&lt;' '<' "$1"
  sed_sub_in_place '&gt;' '>' "$1"
  sed_sub_in_place '&quot;' '"' "$1"
  sed_sub_in_place '&apos;' "'" "$1"
  sed_sub_in_place '&amp;' '&' "$1"
}

As for use:


sed_htmlize data.txt

to turn &,<,>,",' into HTML predefined entities, and


sed_unhtmlize data.txt

to get the original version of data.txt file back.
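
The order of the substitutions matters for the round trip: & has to be escaped first, and &amp; has to be unescaped last, otherwise text that already contains something like &lt; does not come back unchanged. The sketch below shows this on a stream with plain sed. The function names htmlize_stream and unhtmlize_stream are hypothetical, and no file is modified.

```shell
#!/usr/bin/env bash
# Hypothetical stream versions of the two functions above.
htmlize_stream() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' \
      -e 's/"/\&quot;/g' -e "s/'/\&apos;/g"
}

unhtmlize_stream() {
  sed -e 's/&lt;/</g' -e 's/&gt;/>/g' -e 's/&quot;/"/g' \
      -e "s/&apos;/'/g" -e 's/&amp;/\&/g'
}

original='<p class="x">Tom & Jerry write &lt; as is</p>'
encoded=$(printf '%s\n' "$original" | htmlize_stream)
decoded=$(printf '%s\n' "$encoded" | unhtmlize_stream)

printf '%s\n' "$encoded"
printf '%s\n' "$decoded"   # same as $original
```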

Safety of substituting in place

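In short, substituting in place is neither safe nor recommended in most cases. For this reason both functions presented here keep a copy of the file as it was before the substitution: sed_sub_in_place in file.tmp and bash_sub_in_place in file.timestamp.nanoseconds.tmp. Note that file.tmp is overwritten by every sed_sub_in_place call on the same file, so after several calls in a row it no longer holds the original contents. Make sure you check that your substitutions produce the desired effect before deleting temporary files. Even better, work on copies of the original files.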
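
One way to follow that advice is to run the substitution on a temporary copy and look at the difference before replacing anything. The sketch below assumes mktemp and diff are available. preview_sub is a hypothetical name, and it takes a ready sed expression rather than a search and replace pair.

```shell
#!/usr/bin/env bash
# preview_sub (hypothetical): applies a sed expression to a temporary
# copy of the file and prints a unified diff; the file itself is untouched.
# Usage: preview_sub sed_expression file
preview_sub() {
  local expr=$1 file=$2 tmp
  tmp=$(mktemp) || return 1
  sed "$expr" "$file" > "$tmp"
  diff -u "$file" "$tmp" || true   # diff exits with 1 when the files differ
  rm -f "$tmp"
}

# Example run on a throwaway file.
demo=$(mktemp)
printf '%s\n' 'one < two' > "$demo"
preview=$(preview_sub 's/</\&lt;/g' "$demo")
printf '%s\n' "$preview"
after=$(cat "$demo")   # the file still holds: one < two
rm -f "$demo"
```

Only when the diff looks right would the real in-place substitution be run.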

This post was updated on 27 Apr 2017 16:05:42

Tags:  BASH ,  sed 


Author, Copyright and citation

Author

Sylwester Wojnowski

The author of the above article, Sylwester Wojnowski, is the sWWW admin and owner. He enjoys doing Maths and studying algorithms, writing code in scripting and command languages, Thrash Metal music and playing electric guitar.

Copyrights

©Copyright, 2019 Sylwester Wojnowski. This article may not be reproduced or published as a whole or in parts without permission from the author. If you share it, please give author credit and do not remove embedded links.

Computer code, if present in the article, is excluded from the above and licensed under GPLv3.

Citation

Cite this article as:

Wojnowski, Sylwester. "Substituting file contents in place with BASH and sed." From sWWW - Code For The Web . https://wojnowski.net.pl//main/index/substituting-file-contents-in-place-with-bash-and-sed