Viewing 28 posts - 1 through 28 (of 28 total)
  • Calling all *NIX geeks – help!
  • woody2000
    Full Member

    I have 2 directories containing data files that differ in name. I want to copy files that differ from the source directory to the destination directory, but I need them to keep the name that already exists in the destination directory.

    For example, in the source directory I have a file called emplo00171.dat and in the destination directory the file is called emplo00106.dat, so I would need to copy the file and keep the "new" name. The filenames are identical in each directory barring the last 3 characters.

    I'm sure rsync or similar can do just this, but my brain just isn't working today and I can't figure it out. Any suggestions? SUSE Linux BTW.

    Ta 😕

    AdamW
    Free Member

    You are a bit confusing…

    If the files are the same then I'd be tempted to run cksum (or md5sum) against each file to make sure that file A == file B, then choose the name accordingly.
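
A rough sketch of that content check, assuming a POSIX shell and the standard `cksum` utility. The function name `same_content` is made up for illustration and isn't taken from anyone's script in this thread.

```shell
#!/bin/sh
# same_content FILE_A FILE_B  (hypothetical helper)
# Exit 0 if both files have the same cksum output, 1 if they differ,
# 2 if either file is missing.
same_content() {
    [ -f "$1" ] && [ -f "$2" ] || return 2
    # Reading from stdin keeps the filename out of the cksum output,
    # so the two results can be compared as plain strings.
    sum_a=$(cksum < "$1")
    sum_b=$(cksum < "$2")
    [ "$sum_a" = "$sum_b" ]
}
```

`cmp -s fileA fileB` answers the same question byte for byte without computing checksums, so that may be the simpler choice when both files are local.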

    woody2000
    Full Member

    Sorry, what do I need to clarify?

    coffeeking
    Free Member

    Is the mapping constant? i.e. 171 > 106, 172 > 107 etc?

    woody2000
    Full Member

    Unfortunately not ck…!

    zigzag69
    Free Member

    Following should work under ksh. BACKUP YOUR FILES BEFORE YOU RUN IT!

    EDIT: Oh bugger, losing format. Sent you an email.

    coffeeking
    Free Member

    Doesn't this pose the slight problem that you have non-identical files being mapped to non-sequential numbering – i.e. just copying a random bunch of filenames to new filenames? I suppose I'm confused as to:

    Dir 1:
    Files with X contents, named Y

    Dir 2:
    Files with F contents, named Z

    you effectively just have 2 different sets of files, and I don't see how they link up to have their replacement done? Ultimately what it seems you're looking for is a script/program that has a lookup of each file's corresponding second name, and then copies it across?

    grahamb
    Free Member

    something like ….

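    cd source
    for i in * ; do
        # drop the last three characters plus the .dat suffix to get the shared prefix
        j=$(echo "$i" | sed 's/...\.dat$//')
        # find the destination file with that prefix and copy over it
        k=$(ls ../destination/"$j"*) && cp -i "$i" "$k"
    done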

    allthepies
    Free Member

    Oooo, nice shell skilz 😉

    woody2000
    Full Member

    Skillz indeed, I shall do some testing! Cheers all

    grahamb
    Free Member

    I tested it very briefly here on a RHEL system before posting, so should be ok for SLES.

    It'll break if you have more than one file in the destination directory that matches the wildcard. In that case you'd need to do something with the "ls" output to pick the file you want, like piping it through sort & tail/head …
    k=$(ls ../destination/"$j"* | sort -n | tail -1)
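
One way to guard against the multiple-match case, sketched under assumptions: rather than picking a winner with sort and tail, this refuses to answer unless exactly one destination file matches. The helper name `unique_match` and the `???.dat` pattern (exactly three differing characters before the suffix) are assumptions, not part of the script above.

```shell
#!/bin/sh
# unique_match DEST_DIR PREFIX  (hypothetical helper)
# Prints the single file DEST_DIR/PREFIX???.dat, or fails if there are
# zero or several candidates.
unique_match() {
    # Expand the glob into the positional parameters; the prefix is quoted
    # so only the ??? part acts as a wildcard.
    set -- "$1"/"$2"???.dat
    # Zero matches leave the pattern as one literal word, so test existence too.
    if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
        return 1
    fi
    printf '%s\n' "$1"
}
```

In the loop above it could stand in for the ls call, along the lines of `k=$(unique_match ../destination "$j") && cp -i "$i" "$k"`.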

    AdamW
    Free Member

    But this will copy in files that are the same. That is where I got confused – I thought the OP said the files were the same but with different names.

    If you have a file with contents X in both directories A and B then you may end up with multiple copies. That is why I suggested cksum or md5sum to check to see if they had the same contents.

    Or am I just confused here? I usually am.

    buzz-lightyear
    Free Member

    Me too.

    In the example given: emplo00171 ~ emplo00106. But the content of source/emplo00171.dat is fresher, so you want to update the content of destination/emplo00106.dat, right?

    How does a shell-script know that emplo00171 ~ emplo00106?

    Maybe I'm being dim also so am curious to understand this!

    grahamb
    Free Member

    Agreed it's confusing.

    I read it that the OP said that the last 3 chars of the filename will be different, the rest is unique. I assumed by that he meant that the directories have say in dest & src ….

    foo00234 ~ foo00123
    bar00567 ~ bar00456

    and that he wanted to overwrite foo00234 in the dest directory with foo00123 from src, same for bar00567 ~ bar00456.

    Yes, I didn't bother checking whether the contents match (I forgot that bit 😉). But if they do, it'll just use up a bit of disk bandwidth. An md5sum would be easy to add before the cp …

    woody2000
    Full Member

    Basically, the destination directory contains empty data files created by another shell script, the names of which are important to a database manager program. The source directory contains the data I want, but with the "wrong" filenames. I'm having some database connection issues which seem to point to some kind of low level permissions issue (the UNIX file permissions are wide open but I still can't connect to the database with SQL), so I've recreated the tables and now I want to just copy over the data files with the "right" name. I could just unload/reload the data, but there are a LOT of files! It's an Informix C-ISAM database if anyone's interested.

    Any clearer………?

    Thought not! 🙂

    buzz-lightyear
    Free Member

    "last 3 chars of the filename will be different, the rest is unique"

    Indeed, but that didn't stack up with the example given [scratches head].

    I assumed that the source and destination file contents must differ, because if the contents currently match (md5sum check), and he wants to preserve the destination filename – what is his purpose in copying at all?

    Arguably this is a microcosm of the problems facing software engineers: weak problem specification 😀

    Nifty script BTW.

    EDIT: OK the destination files are empty. Come on Woody my Toy Story mate, explain how you know which source filenames should match which destination filenames?

    allthepies
    Free Member

    >Indeed, but that didn't stack up with the example given [scratches head].

    I had assumed that one had to ignore the filename suffix, i.e. strip off the .dat and then remove the last three chars from the resultant filename

    so:-

    >emplo00171.dat and in the destination directory the file is called emplo00106.dat

    Remove the ".dat" suffix and then remove the 171 and 106 chars from each filename and you get the match.

    Very confusing OP though 🙂
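
A small sketch of that stripping using plain shell parameter expansion, assuming the suffix is whatever follows the last dot and that exactly three characters before it vary. `match_key` is a hypothetical helper name.

```shell
#!/bin/sh
# match_key FILENAME  (hypothetical helper)
# Prints the part of the name that both directories are assumed to share.
match_key() {
    base=${1%.*}                  # drop the suffix: emplo00171.dat -> emplo00171
    printf '%s\n' "${base%???}"   # drop the last three characters -> emplo00
}
```

With the filenames from the original post, `match_key emplo00171.dat` and `match_key emplo00106.dat` both print `emplo00`, which is the match.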

    damion
    Free Member

    Here's my punt at the logic:

    foreach srcfile in srcdir
        foreach destfile in destdir
            if !diff srcfile destfile
                cp srcfile destfile
            fi
        done
    done

    Please excuse the perl/shell muddle, I'm in a mixed up world at the moment. If the logic's right I'll spend some thought on it.

    Damion.

    woody2000
    Full Member

    Sorry!

    I'm confused too – I'd normally go and bother a programmer, but I'm just trying to figure it out for myself. allthepies – there's a corresponding .idx file too, so stripping the suffix is a no go.

    Buzz – the filenames only differ by the last 3 digits, so for each file in the source directory, there's a corresponding file in the destination directory with a slightly different name.

    Eg:

    /src/file100101.dat /dest/file100102.dat
    /src/file200101.dat /dest/file200101.dat

    and so on. Any better?
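
Putting that description into a hedged sketch: for every .dat and .idx file in the source directory, strip the three varying characters, look for exactly one file with the same prefix and suffix in the destination, and copy the data over it while keeping the destination name. The function name `copy_by_prefix`, the directory arguments, and the rule of skipping anything without a unique match are assumptions for illustration. Test on a backup first.

```shell
#!/bin/sh
# copy_by_prefix SRC_DIR DEST_DIR  (hypothetical helper, not from the thread)
copy_by_prefix() {
    src_dir=$1
    dest_dir=$2
    for src in "$src_dir"/*.dat "$src_dir"/*.idx; do
        [ -f "$src" ] || continue          # an unmatched glob stays literal
        name=${src##*/}                    # e.g. file100101.dat
        ext=${name##*.}                    # dat
        base=${name%.*}                    # file100101
        prefix=${base%???}                 # file100
        # Skip names too short to have three varying characters.
        if [ -z "$prefix" ] || [ "$prefix" = "$base" ]; then
            continue
        fi
        # Reuse the positional parameters to hold the glob matches.
        set -- "$dest_dir"/"$prefix"???."$ext"
        if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
            echo "skipping $name: no unique match in $dest_dir" >&2
            continue
        fi
        cp "$src" "$1"                     # destination keeps its own name
    done
}
```

Called as `copy_by_prefix /src /dest`. Plain `cp` onto an existing file overwrites its contents and leaves the destination's name and permissions alone.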

    allthepies
    Free Member

    For future info, a "filename" is the whole shebang, including the .dat / .idx / whatever suffix. So when you mention the last three chars of a filename, programmers will assume you mean the suffix (.dat/.idx etc).

    damion
    Free Member

    Does that mean we're not interested in the contents? So:

    cp src/file1xxx dst/file1yyy
    cp src/file2xxx dst/file2yyy

    assuming that there wouldn't be a file1zzz?

    EDIT: Oh that's not clear either. Damn it, I'm going back to my crontab now…

    woody2000
    Full Member

    Sorry – I told you my brain was out of order today!

    damion – I could just copy each one by hand, but there's a lot of them!

    samuri
    Free Member

    Damion's original post has the logic right from what I'm reading. Seems pretty simple to me. I'll knock up a script later tonight if no-one else already has.

    damion
    Free Member

    If we're only matching on everything bar the last three chars before the suffix, then you could generate a file list, match, then copy. If that's what you want, give me a minute….

    woody2000
    Full Member

    damion – I think that's about the long and the short of it, cheers

    damion
    Free Member

    woody YGM.

    I've got to head off now, so if it's not what you needed, Samuri, it's over to you….

    buzz-lightyear
    Free Member

    Gotcha!

    samuri
    Free Member

    not sure if this has been answered but I just knocked this up

    I like to keep shell scripts simple so they're easy to edit, so no clever scripting skillz here.

    Obviously you'll need to edit this for the source and destination directories and for the lengths of the filename sections but as long as the filename lengths are consistent this will work. You can always apply your own suffix if there are multiple ones.

    edit: oh ffs! The phorum code is interpreting the script as html and I can't be bothered working my way through it.

    It's here
    http://www.samuri.co.uk/junk/script.txt

