public inbox for blinux-list@redhat.com
* use of comm and sort tools
From: blinux-list @  UTC (permalink / raw)


I wrote previously about ffmpeg and audio variable bitrate. After conversion, there are some files that did not convert. I would like to compare two listings and discover which ones are missing. So, we have these commands:
find . -type f -name \*.m4a | sed -e 's@.*/@@' -e 's/\.m4a$//' > m4a.txt
find raw2 -type f -name \*.mp3 | sed -e 's@.*/@@' -e 's/\.mp3$//' > mp3.txt
Now I want to run comm and have it dump to another file which lines in m4a.txt do not exist in mp3.txt. How would I go about doing that? Or is there a better way?



* use of comm and sort tools
From: blinux-list @  UTC (permalink / raw)


Strip off the extensions using basename, then use diff.
for i in $(cat m4a.txt);do
basename $i .m4a >> first.txt
done

for i in $(cat mp3.txt);do
basename $i .mp3 >> second.txt
done
  diff first.txt second.txt
If your filenames contain spaces, the for loops above will not work.
Use the detox program to fix that first.
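A possible space-safe variant of the loops above, offered only as a sketch: it reads each list one whole line at a time with "while IFS= read -r" instead of word-splitting the output of cat, so names containing spaces stay intact. The scratch directory and the sample file names are made up for illustration.

```shell
# Work in a scratch directory with hypothetical sample lists.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'song one.m4a' 'song two.m4a' > m4a.txt
printf '%s\n' 'song one.mp3' > mp3.txt

# Read whole lines so names with spaces are not split into words.
while IFS= read -r name; do
  basename "$name" .m4a
done < m4a.txt | sort > first.txt

while IFS= read -r name; do
  basename "$name" .mp3
done < mp3.txt | sort > second.txt

# diff exits with status 1 when the files differ, hence the "|| true".
# Lines starting with "<" exist only in first.txt (m4a with no mp3).
missing=$( { diff first.txt second.txt || true; } | sed -n 's/^< //p')
printf '%s\n' "$missing"
```

Sorting both lists first keeps diff from reporting differences that are only a matter of ordering.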
Regards, Willem


On Thu, 12 May 2022, Linux for blind general discussion wrote:

> I wrote previously about ffmpeg and audio variable bitrate. After conversion, there are some files that did not convert. I would like to compare two listings and discover which ones are missing. So, we have these commands:
> find . -type f -name \*.m4a | sed -e 's@.*/@@' -e 's/\.m4a$//' > m4a.txt
> find raw2 -type f -name \*.mp3 | sed -e 's@.*/@@' -e 's/\.mp3$//' > mp3.txt
> Now I want to run comm and have it dump to another file which lines in m4a.txt do not exist in mp3.txt. How would I go about doing that? Or is there a better way?



* use of comm and sort tools
From: blinux-list @  UTC (permalink / raw)


Tim here.  If both files have been sorted, you can find just the
missing ones with

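  $ comm -23 m4a.txt mp3.txt

The "-23" suppresses column 2 (lines only in the second file) and
column 3 (lines common to both), leaving only the lines unique to the
first file: here, the m4a names with no matching mp3.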
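As a small worked sketch of the comm approach: comm -23 prints only the lines unique to its first file argument, and it expects both inputs sorted in the same collation order. The sample names below are invented, and forcing LC_ALL=C is just one way to keep sort and comm in agreement.

```shell
# Scratch directory with hypothetical, deliberately unsorted lists.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'track b' 'track a' 'track c' > m4a.txt
printf '%s\n' 'track c' 'track a' > mp3.txt

# comm needs sorted input; sort both files in place, same locale.
LC_ALL=C sort -o m4a.txt m4a.txt
LC_ALL=C sort -o mp3.txt mp3.txt

# -2 hides lines found only in mp3.txt, -3 hides lines found in both,
# so what remains is the set of names present only in m4a.txt.
LC_ALL=C comm -23 m4a.txt mp3.txt > missing.txt
cat missing.txt
```

Redirecting to missing.txt covers the "dump to another file" part of the question.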

If they're unsorted, I usually reach for awk:

  $ awk 'NR==FNR{b[$0]; next} !($0 in b)' mp3.txt m4a.txt

(note the order reversal of the arguments: first it loads the list of
all the mp3s and then processes m4a.txt, emitting items that weren't
in the mp3 list)
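The awk one-liner can be tried on unsorted sample data like this. It is only a sketch: the list contents are invented, and the array is named "seen" here purely for readability.

```shell
# Scratch directory with hypothetical unsorted lists.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'zebra' 'apple pie' 'mango' > m4a.txt
printf '%s\n' 'mango' 'zebra' > mp3.txt

# NR==FNR holds only while the first file (mp3.txt) is being read:
# each of its lines becomes an array key, then "next" skips the rest.
# For the second file, a line is printed only if it was never stored.
missing=$(awk 'NR==FNR { seen[$0]; next } !($0 in seen)' mp3.txt m4a.txt)
printf '%s\n' "$missing"
```

One caveat: if the first file is empty, NR==FNR stays true while the second file is read, so nothing is printed; it is worth checking that mp3.txt is non-empty first.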

Hope this helps,

-Tim

On May 12, 2022, Linux for blind general discussion wrote:
> I wrote previously about ffmpeg and audio variable bitrate. After
> conversion, there are some files that did not convert. I would like
> to compare two listings and discover which ones are missing. So, we
> have these commands:
> find . -type f -name \*.m4a | sed -e 's@.*/@@' -e 's/\.m4a$//' > m4a.txt
> find raw2 -type f -name \*.mp3 | sed -e 's@.*/@@' -e 's/\.mp3$//' > mp3.txt
> Now I want to run comm and have it dump to another file which lines
> in m4a.txt do not exist in mp3.txt. How would I go about doing that?
> Or is there a better way?



* use of comm and sort tools
From: blinux-list @  UTC (permalink / raw)


This worked for me without the temporary text files.


for f in *m4a; do
  if test \! -f "${f%%.m4a}.mp3"; then
    echo "$f exists, but ${f%%.m4a}.mp3 does not."
  fi
done

Works for me using bash and zsh; your mileage may vary if you use a
different shell.

~Kyle



* use of comm and sort tools
From: blinux-list @  UTC (permalink / raw)


Tim's suggestion may be best, since mine made the assumption that you 
are in the current directory where your mp3 and m4a files are, and it 
also made the assumption that you have all the m4a files you are 
supposed to have, but are missing some of the mp3 files. Of course you 
can reverse things to look for missing m4a files, and you can put */ in 
front of the file if your directory structure is different. My main goal 
was to try to eliminate any temporary files and just print the m4a files 
that don't have matching mp3.
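For the case where the two sets of files live in different directories, the same no-temporary-files idea might look like the sketch below. The raw2 directory name comes from the original find command; the src directory, the scratch location, and the sample file names are assumptions made for the example.

```shell
# Hypothetical layout: m4a files in src/, converted mp3 files in raw2/.
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/raw2"
touch "$tmp/src/one.m4a" "$tmp/src/two words.m4a" "$tmp/raw2/one.mp3"

# Collect the base names of m4a files that have no mp3 counterpart.
missing=$(
  for f in "$tmp"/src/*.m4a; do
    name=${f##*/}        # drop the directory part
    name=${name%.m4a}    # drop the extension
    if [ ! -f "$tmp/raw2/$name.mp3" ]; then
      printf '%s\n' "$name"
    fi
  done
)
printf '%s\n' "$missing"
```

Because every expansion is quoted, names with spaces are handled without a cleanup step.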

~Kyle



* use of comm and sort tools
From: blinux-list @  UTC (permalink / raw)


This is what I was after. Thanks.

----- Original Message -----
From: Linux for blind general discussion <blinux-list at redhat.com>
To: blinux-list at redhat.com
Date: Thu, 12 May 2022 07:14:38 -0500
Subject: Re: use of comm and sort tools

> Tim here.  If both files have been sorted, you can find just the
> missing ones with
>
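>   $ comm -23 m4a.txt mp3.txt
>
> The "-23" suppresses column 2 (lines only in the second file) and
> column 3 (lines common to both), leaving only the lines unique to the
> first file: here, the m4a names with no matching mp3.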
>
> If they're unsorted, I usually reach for awk:
>
>   $ awk 'NR==FNR{b[$0]; next} !($0 in b)' mp3.txt m4a.txt
>
> (note the order reversal of the arguments: first it loads the list of
> all the mp3s and then processes m4a.txt, emitting items that weren't
> in the mp3 list)
>
> Hope this helps,
>
> -Tim

