How to get the characters of a record composed of several columns with awk? [closed]

I have a file with more than 1000 records that look like this:

AAAA 100141167                100141167                     100141167            60000000069615260000000026093000000000495700000000310500000000031050B/V
ESAC 00100  100021110                100021110                     100021110            0000000564111270000000026815000000000509500000000319100000000031910B/V
ESAC 00100  10002120K                10002120K                     10002120K            0000000170389970000000023278000000000442300000000277010000000027701B/V
ESAC 00100  100021218                100021218                     100021218            0000000051235480000000019160000000000364000000000228000000000022800B/V
ESAC 00100  100021269                100021269                     100021269            0000000612372560000000031252000000000593800000000371900000000037190B/V
ESAC 00100  100021285                100021285                     100021285            0000000349449790000000013445000000000255500000000160000000000016000B/V
ESAC 00100  100021315                100021315                     100021315            0000000407589910000000030924000000000587600000000368000000000036800B/V
ESAC 00100  10002165K                10002165K                     10002165K            0000000013006070000000011400000000000000000000000114000000000005000FAC
ESAC 00100   7557068675570686                                     2000306451935862066        20003064510000000008100000000000190000000000100000000000010000B/V
ESAC 00100   7557068675570686                                     2000306451935862066        20003064510000000008100000000000190000000000100000000000010000FAC

For each record, I need to take the 13 characters that come right before B/V or FAC and add those values up.

I am writing a shell script in which I want to store the result in a variable so I can make comparisons.

This is my code:

var=$(awk 'BEGIN {suma=0} NR>1 {suma=suma+(substr($5,47,13)) } END {printf( "%.0f", suma)}' fichero)

but this does not work for every record, because the positions vary. So I need code that works backwards from B/V or FAC: it should take the 13 characters before B/V or FAC in each record and sum them.

    
asked by Ricardo 04.07.2018 at 21:16

1 answer

You say you want to get 12 characters starting at position 143. However, the rule you describe is really "the 12 characters before B/V or FAC", so I will focus on that to keep the solution simpler and more generic.

Although awk is quite powerful, I think we can first "clean up" the input here by using grep with a lookahead (options -P and -o) to extract the 12 digits that come before B/V or FAC:

$ grep -Po '[0-9]{12}(?=B/V|FAC)' fichero
000000031050
000000031910
000000027701
000000022800
000000037190
000000016000
000000036800
000000005000

After that, it is just a matter of summing them with awk:

... | awk '{suma+=$1} END{print suma+0}'

Printing suma+0 instead of suma ensures that a number (0) is always returned, even when there are no input lines.
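To see why the +0 matters, here is a small sketch that feeds awk an empty input. The variable names without_zero and with_zero are only illustrative. An awk variable that was never assigned prints as an empty string, while adding 0 forces it to be treated as a number.

```shell
# Empty input: the main block never runs, so suma is never assigned.
# Printing it as-is yields an empty string.
without_zero=$(printf '' | awk '{suma+=$1} END{print suma}')

# Adding 0 forces numeric context, so the output is 0.
with_zero=$(printf '' | awk '{suma+=$1} END{print suma+0}')

printf 'without +0: "%s"\n' "$without_zero"
printf 'with +0:    "%s"\n' "$with_zero"
```

This matters when the result is later used in a numeric comparison, where an empty string would cause an error.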

Everything together, and using your input file, would give us this result:

$ grep -Po '\d{12}(?=B/V|FAC)' fichero | awk '{suma+=$1} END{print suma+0}'
208451

If you want to save the result in a variable, use var=$(command). That is: variable=$(grep -Po '\d{12}(?=B/V|FAC)' fichero | awk '{suma+=$1} END{print suma+0}') .
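As a rough sketch of the "store it and compare it" step, the following captures the sum and tests it against a limit. It assumes a grep that supports -P (GNU grep). The three records are made-up, shortened samples in the same style as the question, and the names limit and status, as well as the value 50000, are only illustrative.

```shell
# Made-up sample records (shortened); the last 12 digits before the
# marker are 31910, 27701 and 5000 respectively.
variable=$(printf '%s\n' \
    'ESAC 00100 100021110 0000000026815000000000031910B/V' \
    'ESAC 00100 10002120K 0000000023278000000000027701B/V' \
    'ESAC 00100 10002165K 0000000011400000000000005000FAC' |
    grep -Po '\d{12}(?=B/V|FAC)' |
    awk '{suma+=$1} END{print suma+0}')

limit=50000   # hypothetical threshold, for illustration only

# -gt does an integer comparison; suma+0 guarantees a number even
# when nothing matched.
if [ "$variable" -gt "$limit" ]; then
    status="above"
else
    status="not above"
fi

echo "sum=$variable is $status the limit $limit"
```

With real data, replace the printf with the file name as an argument to grep.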

From the comments it turns out that you are working on AIX, without GNU awk, GNU grep, or GNU sed, so my suggestions above do not apply. In that case, what finally solved the problem was:

awk '{if (length($5)> 63) {suma=suma+(substr($5,50,13))} else {suma=suma+(substr($5,47,13))}} END {printf("%.0f\n", suma)}' fichero

That, expanded, is:

awk '{
        if (length($5)> 63) {suma=suma+(substr($5,50,13))}
        else {suma=suma+(substr($5,47,13))}
     }
     END {printf("%.0f\n", suma)}' fichero
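The fixed offsets 47 and 50 depend on the field length. A possible alternative, not the solution used in the comments and not tested on AIX, is to locate the marker itself and count backwards from it, which is what the question originally asked for. This sketch assumes that B/V or FAC is the very end of each record, and uses only match(), RSTART, and substr(), which are part of standard awk. The function name sum_before_marker and the sample records are made up for illustration.

```shell
# Sum the n characters that precede a trailing B/V or FAC marker.
# Reads the files given as arguments, or standard input if there are none.
sum_before_marker() {
    awk -v n=13 '
        # match() sets RSTART to the position where the marker begins;
        # records without a trailing marker are skipped.
        match($0, /(B\/V|FAC)$/) && RSTART > n {
            suma += substr($0, RSTART - n, n)
        }
        END { printf("%.0f\n", suma) }
    ' "$@"
}

# Made-up sample records (shortened): 31910 + 27701 + 5000.
total=$(printf '%s\n' \
    'ESAC 00100 100021110 0000000026815000000000031910B/V' \
    'ESAC 00100 10002120K 0000000023278000000000027701B/V' \
    'ESAC 00100 10002165K 0000000011400000000000005000FAC' |
    sum_before_marker)

echo "$total"
```

For a real file it would be called as total=$(sum_before_marker fichero). Because the position is taken from the marker, the length of the last field no longer matters.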
    
answered 06.07.2018 at 12:16