Delete text after ==

Question

2

A pip requirements file lists all the packages installed in a Python environment, so that the file can be used elsewhere to rebuild the original environment.

A requirements file looks like this:

alabaster==0.7.9
arrow==0.8.0
awesome-slugify==1.6.5
Babel==2.3.4
binaryornot==0.4.0
blessings==1.6

What I want is to remove the part that indicates the version: in the first line, alabaster==0.7.9 , delete ==0.7.9 and leave only alabaster .

I understand that a match should produce two groups, but I cannot make it work. I'm trying it on Ubuntu with awk, as follows.

When I ask for the first group:

$ awk -F"==" '{print $1}' base.txt

I get this:

alabaster==0.7.9
arrow==0.8.0
awesome-slugify==1.6.5

that is, the whole file is printed back unchanged.

When I ask for the second group with

$ awk -F"==" '{print $2}' base.txt

I only get 50 blank lines.

Addendum 1

Now I'm trying the pattern (\w+)(==.) , which gives me two match groups; I'm interested in the first one. But if the package is called python-mimeparse there is no match anymore. The pattern should also allow hyphens and underscores, in case a package is called paquete_python or paquete-python .

Addendum 2

This expression (.+)(==)(.+) finds three groups: the first is the package (which is what I'm looking for) and the third is the version. Now I just need to know how to use it in awk .

Addendum 3

I published an answer that solves the problem in Python, but the idea is to solve it with some other tool such as awk , gawk , sed or even perl .

There are several options in this SOEN publication, but I have not been able to use my search pattern in any of them. I do not get errors, but I do not get any results either.

Some considerations:

- I'm looking for just the name of the package, not the version
- There is no package installed, so there is nothing to update
- The solution can use another tool, such as sed or grep

regex unix awk sed

asked by toledano 10.03.2017 at 19:26

4 answers

3

A. VALUES TO THE LEFT OF ==

Option 1.

Match everything that comes before == (the lookahead keeps == out of the match)

^.*?(?=\=\=)/gm

Option 2.

Match up to == using a non-capturing group for the separator

Thanks @fedorqui

^.*?(?:==)/gm

Result

alabaster
arrow
awesome-slugify
Babel
binaryornot
blessings
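
The two patterns in section A can be tried out with a short Python sketch. This assumes Python's re module as a stand-in for the regex tester used in the answer: re.MULTILINE plays the role of the m flag and findall the role of g. The sample text is taken from the question. Note that, at least in this flavor, the non-capturing variant still includes == in the overall match, so the lookahead variant is the one that yields the bare names.

```python
import re

# Sample lines taken from the question's requirements file.
text = "alabaster==0.7.9\narrow==0.8.0\nawesome-slugify==1.6.5\n"

# Option 1: lazy match from the start of each line, stopping right
# before "==" thanks to the lookahead (the separator is not consumed).
names = re.findall(r"^.*?(?===)", text, flags=re.MULTILINE)

# Option 2: a non-capturing group still consumes "==", so the
# separator is part of every match.
with_separator = re.findall(r"^.*?(?:==)", text, flags=re.MULTILINE)

print(names)           # ['alabaster', 'arrow', 'awesome-slugify']
print(with_separator)  # ['alabaster==', 'arrow==', 'awesome-slugify==']
```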

B. VALUES TO THE RIGHT OF ==

=.*

Result

==0.7.9
==0.8.0
==1.6.5
==2.3.4
==0.4.0
==1.6

answered 10.03.2017 at 19:56

1

Try this command in Bash:

cat requirements.txt | grep -oP "\w+[-_]{0,1}\w+"

Here requirements.txt is the pip requirements file.

The important part is the regular expression: the one I include allows for a hyphen or underscore separator inside the name. I updated @A. Cedano's example so you can see it live.

If you need to save the result to a file (you surely do), you can use output redirection:

cat requirements.txt | grep -oP "\w+[-_]{0,1}\w+" > salida.txt
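
Before relying on this pattern it may be worth checking what it matches on other inputs. The sketch below assumes Python's re.findall behaves like grep -oP for this pattern on plain ASCII lines; the line example-pkg==10.2 is a hypothetical entry, not part of the original file. It suggests that a version component with two or more digits would also be printed, since \w matches digits too.

```python
import re

# Same pattern as in the grep command above.
pattern = re.compile(r"\w+[-_]{0,1}\w+")

def matches(line):
    """Return every non-overlapping match, as grep -o would print them."""
    return pattern.findall(line)

# A line from the question: single-digit version parts are too short
# to satisfy \w+...\w+, so only the name is matched.
print(matches("awesome-slugify==1.6.5"))  # ['awesome-slugify']

# Hypothetical line: the two-digit component "10" is matched as well.
print(matches("example-pkg==10.2"))       # ['example-pkg', '10']
```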

I hope it helps. Regards.

answered 10.03.2017 at 21:03

1

The alternative in Python is as follows:

import re

# Named groups: paquete (package name), the "==" separator, and version.
r = re.compile(r'(?P<paquete>.+)(==)(?P<version>.+)')
with open('base.txt') as f:
    for l in f:
        print(r.search(l).group('paquete'))

The first line imports the re module, which handles regular expressions.
Since we are going to apply the same search to every line, we compile a pattern object with the expression we are looking for:
- The first group is named paquete and matches any characters, in any quantity.
- The second group is just the separator, formed by the two equals signs.
- The third group is named version and matches the rest of the characters after the second group.
We go through the requirements file line by line, pass each line to the search (which uses the previously compiled pattern) and print only the paquete group of the match.
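
One thing to watch for with this approach: search() returns None for a line that has no == (a blank line or a comment, for instance), and calling group() on it raises an error. A minimal sketch of a more defensive variant follows; the function name package_names and the sample lines are illustrative assumptions, not part of the original answer.

```python
import re

# Lazy first group so the name stops at the first "==".
PATTERN = re.compile(r"(?P<paquete>.+?)==(?P<version>.+)")

def package_names(lines):
    """Return the package name of every line of the form name==version."""
    names = []
    for line in lines:
        m = PATTERN.search(line.strip())
        if m:  # skip lines without "==" instead of failing on None
            names.append(m.group("paquete"))
    return names

# Hypothetical input mixing valid lines with a comment and a blank line.
sample = ["alabaster==0.7.9\n", "# a comment\n", "\n", "awesome-slugify==1.6.5\n"]
print(package_names(sample))  # ['alabaster', 'awesome-slugify']
```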

answered 10.03.2017 at 21:45

score 2 · Accepted Answer

The awk -F'==' '{print $1}' archivo solution uses a multi-character field separator ( FS ). This is valid as long as you are using a POSIX-compatible version of awk. On Solaris, for example, it will not work.

So the question is: how to make it work?

Let's simplify: the file consists of lines of the form módulo==versión . Therefore, what we can do is remove the = and everything that follows it:

$ cut -d'=' -f1 fichero
alabaster
arrow
awesome-slugify
Babel
binaryornot
blessings

This says: split the line using = as the delimiter ( -d'=' ) and print the first resulting field ( -f1 ).

It may be a bit fragile, so you can also choose to use sed :

sed 's/=.*//' fichero

This does the same thing: it deletes everything from the first = onward. However, it lets you extend the command to something more complex, such as:

$ sed '/==/s/=.*//' fichero
alabaster
arrow
awesome-slugify
Babel
binaryornot
blessings

This performs the substitution only on lines that contain == . And if pressed, you can go one step further:

sed -n '/==/s/=.*//p' fichero

This prints only those lines ( -n suppresses the default printing and the p flag prints the line when the substitution succeeds).
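
For readers who want to see the logic of that last sed command spelled out, here is a rough Python sketch of the same steps. It is an approximation written for illustration: the function name strip_versions and the -r other.txt line are assumptions, not part of the original answer.

```python
import re

def strip_versions(lines):
    """Approximate sed -n '/==/s/=.*//p': keep only lines containing "==",
    and on those delete everything from the first "=" onward."""
    result = []
    for line in lines:
        line = line.rstrip("\n")
        if "==" in line:                        # the /==/ address
            result.append(re.sub(r"=.*", "", line, count=1))  # s/=.*//
    return result

# Hypothetical input: the middle line has no "==" and is dropped.
sample = ["alabaster==0.7.9\n", "-r other.txt\n", "Babel==2.3.4\n"]
print(strip_versions(sample))  # ['alabaster', 'Babel']
```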

If you really want to use awk, use the following (it relies on the three-argument form of match() , which is a GNU awk extension):

$ awk 'match($0, /^(.+)==(.+)/, res) {print res[1]}' fichero
alabaster
arrow
awesome-slugify
Babel
binaryornot
blessings

As you can see, the syntax is match(line, pattern, results array) . It is therefore a matter of capturing the groups that interest us: in this case only the first, so in fact we could limit ourselves to match($0, /^(.+)==/, res) , without capturing the rest.

In a nutshell: awk does not seem like the best solution here, because depending on the environment the multi-character field separator may give you problems. Make your life easy by using sed in this case: you do not need such complex regular expressions when a simple sed already gives you everything you need.