How can I sort -unique every other line in Linux?

I have a FASTA file and I want to remove redundancies only among the sequences (the even-numbered lines), not the headers.
>headerX
**SEQUENCE1**
>headerY
SEQUENCE2
>headerZ
**SEQUENCE1**
I want to get rid of the duplicate sequence (SEQUENCE1).
It's not clear what output you expect. Would you drop both lines, >headerZ and **SEQUENCE1**, because sequence1 is already under >headerX? Or would you keep >headerZ, with no line under it? Please provide a sample output for the sample input. – Stephen P
yesterday
Please avoid "Give me the codez" questions. Instead show the script you are working on and state where the problem is. Also see How much research effort is expected of Stack Overflow users?
– jww
yesterday
1 Answer
You can use sed for this:
sed -n 2~2p data.fasta | sort -u
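This prints the even-numbered lines of data.fasta (the sequences) and sorts them, removing duplicates. Note that the 2~2 address (first~step) is a GNU sed extension, and that the headers are not part of the output.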
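If the headers need to be kept, one possible approach is awk. The sketch below is only one reading of the question, since the expected output was never stated: it assumes strict two-line records (a header followed by a single sequence line) and keeps the first header seen for each distinct sequence, dropping later duplicates together with their headers. The function name dedupe_fasta and the file sample.fasta are made up for illustration.

```shell
# Hypothetical helper: keep the first record for each distinct sequence.
# Assumes strict two-line records (header line, then one sequence line).
dedupe_fasta() {
  awk '
    NR % 2 == 1 { header = $0; next }    # odd lines: remember the header
    !seen[$0]++ { print header; print }  # even lines: print first occurrence only
  ' "$1"
}

# Sample input modelled on the question, written to a scratch file.
printf '%s\n' '>headerX' 'SEQUENCE1' '>headerY' 'SEQUENCE2' '>headerZ' 'SEQUENCE1' > sample.fasta

dedupe_fasta sample.fasta
```

Unlike sort -u, this keeps the records in their original order. It would not work as written for FASTA files whose sequences wrap across several lines.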
Your question is very hard for mortals to understand. Could you try to make it clearer to non-FASTA, bio-humans please? Your title mentions "every other line" - are you showing us the line that needs changing or the other one? Or is it a trick?
– Mark Setchell
yesterday