What is the purpose of the awk command in relation to file.txt?

  • Thread starter frankliuao
In summary: An "assignment" is when you explicitly put a value in one of the fields. A "reevaluation" is what awk does to the record afterward: once any field has been assigned, even to its own value, awk rebuilds the entire record ($0) from its fields.
  • #1
frankliuao
awk '{$1=$1}1' file.txt

This command deletes the leading spaces from each line of file.txt.

But why? Can anyone tell me what $1=$1 does, and what the single 1 does?

Thanks,
 
  • #2


That's just an awk idiom that forces it to recompute the value of $1. It's deleting leading spaces because by default, awk ignores them when reading values into vars. Personally I'd never use that syntax; it's ugly (it's intentionally obfuscated -- there is a more explicit way to write it) and won't always work depending on filename. Perl is better for that sort of job.

perl -pi.bak -e 's/^\s+//' file.txt

This will remove leading spaces and tabs from file.txt, making a backup of the original just in case as file.txt.bak.
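
For comparison, here is a sketch of a more explicit way to strip only the leading whitespace in awk itself. It is an illustration under assumptions, not necessarily the form the poster above had in mind, and the function name trim_leading is made up for this example. Unlike the {$1=$1} idiom, it leaves the spacing between words alone.

```shell
# Hypothetical helper (name invented for this example):
# delete leading blanks and tabs from every line, keep everything else as-is.
trim_leading() {
  # sub() with two arguments edits $0 in place; the pattern 1 then prints it.
  awk '{ sub(/^[ \t]+/, "") } 1' "$@"
}

printf '   alice   betty\n\tcharlie\n' | trim_leading
# prints:
# alice   betty
# charlie
```

Called as trim_leading file.txt, it writes the trimmed lines to standard output and leaves file.txt itself untouched.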
 
  • #3


Here is another Perl version, which prints the result to standard output:

perl -p -e 's/^ *//' file.txt

And a sed version, which edits file.txt in place:

sed -i 's/^ *//' file.txt
 
  • #4
frankliuao: I have not tried your awk command yet (so I could be wrong), but the $1=$1 appears to redefine the beginning of the current line ($0) with the first string ($1); therefore, any leading spaces in $0 are removed. The "1" following the right-hand brace might be a typographic mistake, and therefore might be ignored (?). Does it make any difference in your output, if you remove the appended 1?

Actually, I am rather surprised if your awk command prints anything, because the default action is to print $0 if no action is supplied (the portion inside the braces). But your awk command contains an action, and it is not print $0. Therefore, it seems you would need to explicitly tell it to print $0, if you want it to print the lines. Otherwise, it would print nothing.
 
  • #5


You piqued my curiosity, so I checked.

With the "1" it works.
Without it nothing is printed.

So it appears the "1" specifies that $0 should be printed.
I'm not sure where to find that in the manuals though.
 
  • #6
Interesting, I like Serena. Thanks for checking that. Did it really remove leading spaces? If so, does awk '' file.txt remove leading spaces, or not?
 
  • #7


awk '' file.txt does not print anything.
But awk '{$2=$2}1' file.txt has the same effect as awk '{$1=$1}1' file.txt.
In particular they also replace all non-leading sequences of white space by a single white space.

Edit: and both awk '1' file.txt and awk '{}1' file.txt have the same effect.
They print the original lines.
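
The observations above can be reproduced with a small sketch like the following. The sample line and the variable names are invented for the illustration; the brackets in the output only make leading and trailing spaces visible.

```shell
# Made-up sample line with leading, interior, and trailing runs of spaces.
sample='  alice    betty   charlie  '

unchanged=$(printf '%s\n' "$sample" | awk '1')           # pattern only, no assignment
via_first=$(printf '%s\n' "$sample" | awk '{$1=$1}1')    # assign field 1 to itself
via_second=$(printf '%s\n' "$sample" | awk '{$2=$2}1')   # assign field 2 to itself

printf '[%s]\n' "$unchanged" "$via_first" "$via_second"
# prints:
# [  alice    betty   charlie  ]
# [alice betty charlie]
# [alice betty charlie]
```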
 
  • #8


This bugged me, so I started reading the manual.
My conclusion is that

awk '{$1=$1}1' file.txt

is shorthand for:

awk '
{$1=$1}
1
' file.txt


That is, the first line automatically matches, since no pattern has been specified.
In the second line the "1" is an expression pattern that is evaluated as true, meaning it always matches.
Since the action is left out on this line, the action defaults to {print}.
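
A quick way to check this reading is to spell out the default action and to swap the always-true pattern for an always-false one. This is only a sketch with invented variable names:

```shell
# Rule 1 has no pattern; rule 2 has the pattern 1 and an explicit print action.
explicit=$(printf '  a   b\n' | awk '{ $1 = $1 } 1 { print $0 }')

# The idiom from the thread: same two rules, with the default action left implicit.
idiom=$(printf '  a   b\n' | awk '{$1=$1}1')

# A false pattern (0) never matches, so no rule prints anything.
none=$(printf '  a   b\n' | awk '{$1=$1}0')

printf '[%s] [%s] [%s]\n' "$explicit" "$idiom" "$none"
# prints: [a b] [a b] []
```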
 
  • #9


That leaves the strange effect of {$1=$1}.
For that I found in the manual:

From the POSIX standard:
The awk utility shall denote the first field in a record $1, the second $2, and so on.
The symbol $0 shall refer to the entire record;
setting any other field causes the re-evaluation of $0.
Assigning to $0 shall reset the values of all other fields and the NF built-in variable.

It's not very specific, but apparently assigning to any of the fields $1, $2, etc. has the effect of joining all the fields together, separated by the output field separator (OFS, a single space by default), and assigning the result to $0.
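
One way to make the rebuilding of $0 visible is to change the output field separator, OFS, which awk places between the fields when it reconstructs the record. The sketch below uses a made-up input line and a hyphen as OFS purely for illustration:

```shell
# With a field assignment, $0 is rebuilt from the fields joined by OFS.
rebuilt=$(printf '  alice   betty  charlie \n' | awk -v OFS='-' '{$1=$1}1')

# Without any assignment, $0 is printed exactly as it was read; OFS plays no part.
untouched=$(printf '  alice   betty  charlie \n' | awk -v OFS='-' '1')

printf '[%s]\n' "$rebuilt" "$untouched"
# prints:
# [alice-betty-charlie]
# [  alice   betty  charlie ]
```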
 
  • #10


It is not "assigning" it to $0, it's reevaluating $0. I explained a few posts back, it's just some ugly awk idiom. Some people use it out of habit because they don't know what it does. Those who understand what it's doing only use it as a sort of shortcut to demonstrate their command of awk.

It's really just a bug in awk. $1=$1 makes the interpreter think you've changed $1, though you obviously haven't, so it reevaluates it. In so doing, leading spaces are trimmed because that's what awk does -- reads whitespace separated fields into $1..$n, trimming the whitespace as it does.
 
  • #11


justsomeguy said:
It is not "assigning" it to $0, it's reevaluating $0. I explained a few posts back, it's just some ugly awk idiom. Some people use it out of habit because they don't know what it does. Those who understand what it's doing only use it as a sort of shortcut to demonstrate their command of awk.

It's really just a bug in awk. $1=$1 makes the interpreter think you've changed $1, though you obviously haven't, so it reevaluates it. In so doing, leading spaces are trimmed because that's what awk does -- reads whitespace separated fields into $1..$n, trimming the whitespace as it does.

What's the difference between "assigning" and "reevaluating"?
And how do you explain that not only $1 is changed, but all other white space in the line is compressed as well?
 
  • #12


I like Serena said:
What's the difference between "assigning" and "reevaluating"?

An evaluation can produce a different result each time it is executed; that's what it means. Assignments, barring external influence, do not change their value (their evaluation).

Perhaps it's too subtle a difference to get into.

I like Serena said:
And how do you explain that not only $1 is changed, but all other white space in the line is compressed as well?

That is what awk *does*. That is its purpose. All input to awk is processed into fields; by default, fields are separated by whitespace. This has nothing to do with the idiomatic assignment of one of the fields to itself.

" alice betty charlie dave" as input to awk always results in $1='alice', $2='betty', etc. unless you change the field separator (FS variable).
 
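
To separate the two effects being discussed, the following sketch prints the first two fields and the whole record for the example line above, without assigning anything. The variable name is invented for the illustration. The fields come out without surrounding whitespace, while $0 still carries its leading space until some field is assigned.

```shell
# Field splitting alone: $1 and $2 are already free of whitespace,
# but $0 is still the line exactly as it was read.
fields=$(printf ' alice betty charlie dave\n' |
  awk '{ printf "[%s] [%s] [%s]", $1, $2, $0 }')

printf '%s\n' "$fields"
# prints: [alice] [betty] [ alice betty charlie dave]
```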

Related to What is the purpose of the awk command in relation to file.txt?

What does the command "awk '{$1=$1}1' file.txt" do?

The command "awk '{$1=$1}1' file.txt" reformats the contents of a text file and prints the result to standard output, leaving the file itself unchanged. It removes leading and trailing whitespace from each line and condenses each run of consecutive whitespace characters into a single space. Each record is printed followed by a newline.

What is the purpose of using Awk in this command?

Awk is a powerful text processing tool that allows for the manipulation and extraction of data from text files. In this command, Awk is used to reformat the contents of a text file to make it more readable or to prepare it for further processing.

What is the significance of '{$1=$1}1' in the command?

The '{$1=$1}1' portion of the command is known as an Awk script. It tells Awk to perform an action on each line of the input file. In this case, it tells Awk to assign the first field of each line (represented by $1) to itself, which causes Awk to rebuild the whole line from its fields, and then to print the rebuilt line (the pattern 1 is always true, and its default action is to print).

Can this command be used on any type of text file?

Yes, this command can be used on any line-oriented text file. However, it normalizes all the whitespace in each line, not just the leading spaces, so it may not have the intended effect on files where interior spacing or tabs are significant, such as column-aligned or tab-separated data.
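
As a hedged illustration of a case where the command may not have the intended effect, consider a tab-separated line whose second column contains a space. The sample data and the variable name are made up for this sketch:

```shell
# Two tab-separated columns: "id" and "full name".
# With the default field separator, awk sees three whitespace-separated fields,
# so the rebuilt record no longer shows where the column boundary was.
flattened=$(printf 'id\tfull name\n' | awk '{$1=$1}1')

printf '[%s]\n' "$flattened"
# prints: [id full name]
```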

Are there any limitations to using this command?

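One limitation of this command is that it cannot trim the leading whitespace alone. Although the assignment names only the first field, it causes the whole record to be rebuilt, so runs of whitespace between fields are also collapsed to single spaces. If only the leading whitespace should be removed, a different Awk script, or a sed or Perl substitution like those shown in the thread, would need to be used.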
