Concatenate a lot of different files


Post by omegatech »

Hello all,

I can't find any good clues here or elsewhere on the web, so I think I'm better off asking some professionals about this.

I'm having trouble putting together a script that can join a lot of files for me. I can manually type in all commands to that script (paste in and change numbering), but that's really tiring. My hands are sore after typing a few hundred of these.

The files are split RAR files that have to be joined.
It looks like this:

B74_ABC_001.rar.001 B74_ABC_001.rar.002 B74_ABC_001.rar.003
MX3_EFG_020.rar.001 MX3_EFG_020.rar.002 MX3_EFG_020.rar.003
MX3_EFG_021.rar.001 MX3_EFG_021.rar.002 MX3_EFG_021.rar.003

The length of the split varies from 2 files up to about 500.

Right now I type this into bash (with a lot of help from autocomplete) for a manual job:

Code:

cat B74_ABC_001.rar.* >> B74_ABC_001.rar

Is there any simple way to accomplish this? A universal script to handle examples like these would be perfect.

Thanks in advance.
Last edited by LockBot on Wed Dec 28, 2022 7:16 am, edited 1 time in total.
Reason: Topic automatically closed 6 months after creation. New replies are no longer allowed.
omegatech

Re: Concatenate a lot of different files

Post by omegatech »

Thanks for your reply, gmilo2.

Unfortunately, that script does not quite work in my situation. It will unpack all the abc.rar.001 files, but not the following parts of each archive (.rar.002 and so on). Those have to be joined to create the complete RAR file. Also, the script unpacks the files, which I don't want. I just want them joined to form a complete archive.

You can entirely disregard that these are RAR files. They could be any kind of split file (text, audio, zip, etc.).
richyrich

Re: Concatenate a lot of different files

Post by richyrich »

IIRC, unraring the first file (in the same directory as all the others) will automatically find the following RAR files and continue until the complete archive is unpacked. At least that is how multi-part RAR archives used to work.
omegatech

Re: Concatenate a lot of different files

Post by omegatech »

Yes, they usually do, but only if they were split with rar in the first place. Those are commonly named "abc.part01.rar abc.part02.rar abc.part03.rar". These archives started life as a single file and were then split, so there's no RAR header in the other files (only in the 001 file).
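
One way to see the point about headers is to look at the first bytes of each piece. This is only a sketch: `has_rar_header` is a made-up helper name, and it assumes the pieces came from a RAR archive, whose signature starts with the four ASCII characters `Rar!`.

```shell
#!/bin/bash
# has_rar_header FILE: hypothetical helper, succeeds if FILE starts with "Rar!".
has_rar_header() {
    [ "$(head -c 4 -- "$1" 2>/dev/null)" = "Rar!" ]
}

# Report on every file named on the command line (does nothing without arguments).
for f in "$@"; do
    if has_rar_header "$f"; then
        echo "$f: starts with a RAR signature"
    else
        echo "$f: no RAR signature (probably a raw continuation piece)"
    fi
done
```

Run against one set (for example `./check.sh B74_ABC_001.rar.*`, where `check.sh` is whatever you name the file), only the .001 piece should report a signature if the set was made by splitting a single archive.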
seawolf167

Re: Concatenate a lot of different files

Post by seawolf167 »

gmilo2 wrote: Could the files have been split with the split command? As in, the source file was transformed into two RAR files and then they ran the split command on the second(?) RAR file?

This is what I was wondering. The files listed have two numbers:

Code:

B74_ABC_001.rar.001 B74_ABC_001.rar.002 B74_ABC_001.rar.003
The first comes before the .rar, indicating a multiple rar archive. The second number is after the .rar, indicating perhaps the split command was used. Something like this may work:

Code:

# cat concatenates the pieces in order (join merges text lines on a common field instead)
cat B74_ABC_001.rar.* > B74_ABC_001.rar
unrar e -r B74_ABC_001.rar
omegatech

Re: Concatenate a lot of different files

Post by omegatech »

I think I will do the spreadsheet-style of putting this together for now, just to get some of them done.


The numbering is quite simple, even if it looks confusing. You only have to join the files with sequential suffixes; everything up to and including .rar is the name of the complete file.

If you have a file, "abcdef_123.rar", it would have been split with:

Code:

split -b 128000 -a 3 -d abcdef_123.rar abcdef_123.rar.
The type of file could be just about anything; my example just happens to be a RAR file. The files are sent through a radio system that has a strict limit on file size.
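
To check that splitting and rejoining really is lossless, a round trip like the following could be tried. It is a sketch under a few assumptions: `sample.bin` and the temporary directory are made-up names, and it uses GNU split's `--numeric-suffixes=1` so that the pieces start at .001 like the listings in this thread (with GNU split, plain `-d` starts numbering at 000).

```shell
#!/bin/bash
set -eu

workdir=$(mktemp -d)
(
    cd "$workdir"

    # 300000 random bytes: two full 128000-byte pieces plus one shorter piece.
    head -c 300000 /dev/urandom > sample.bin

    # 128000-byte pieces with 3-digit numeric suffixes starting at 001 (GNU split).
    split -b 128000 -a 3 --numeric-suffixes=1 sample.bin sample.bin.

    # Keep the original under another name so the joined file can reuse the name.
    mv sample.bin original.bin

    # Zero-padded suffixes sort correctly, so the glob lists the pieces in order.
    cat sample.bin.[0-9][0-9][0-9] > sample.bin

    # cmp is silent and returns 0 only when both files are byte-for-byte identical.
    cmp original.bin sample.bin && echo "round trip OK"
)
```

The same `cmp` check works for any file type, since `cat` just copies bytes and does not care what the pieces contain.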
omegatech

Re: Concatenate a lot of different files

Post by omegatech »

I think I've come up with a solution; at least it seems to work. It's a two-part script: one part generates a list of files to cat, and the other performs the concatenation.

Part one - generate a list of files filtered with my rules (only list the first split file and then strip off suffix):

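Code:

#!/bin/bash

# List each set's first piece (*.001) and strip the final suffix to get the base name.
# Overwrite list.txt (>) so that rerunning does not add duplicate entries.
printf '%s\n' *.001 | sed 's/\(.*\)\..*/\1/' > list.txt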
Part two - read the list file and use cat to generate the complete files:

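Code:

#!/bin/bash

# Usage: ./doit.sh list.txt  (one base name per line)
while IFS= read -r name
do
    echo "Processing: $name"
    # The glob expands in sorted order (.001, .002, ...); > avoids doubling the output on a rerun.
    cat "$name".* > "$name"
done < "$1"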
You must pass the list.txt to the second script (./doit.sh list.txt).

So far this seems to work, although it might not be high quality and could possibly fail horribly ;)
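
For what it's worth, the two parts could probably be folded into one loop without the intermediate list.txt. This is a rough sketch rather than a tested replacement: `join_split_sets` is a made-up function name, it assumes every set has a .001 piece and three-digit suffixes, and it skips any set whose joined file already exists instead of appending to it.

```shell
#!/bin/bash
# join_split_sets [DIR]: hypothetical helper that joins every NAME.001, NAME.002, ...
# set found in DIR (default: current directory) into NAME.
join_split_sets() {
    local dir=${1:-.} first base
    for first in "$dir"/*.001; do
        [ -e "$first" ] || continue          # no matches: the glob stays literal
        base=${first%.001}                   # strip the piece suffix
        if [ -e "$base" ]; then
            echo "Skipping (already exists): $base" >&2
            continue
        fi
        echo "Processing: $base"
        # Zero-padded suffixes make the glob expand in the right order.
        cat "$base".[0-9][0-9][0-9] > "$base"
    done
}

join_split_sets "${1:-.}"
```

Saved as, say, joinall.sh, it would be run as `./joinall.sh /path/to/pieces`, or with no argument to work in the current directory.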
lmuserx4849

Re: Concatenate a lot of different files

Post by lmuserx4849 »

Basically, use bash globbing (which is different from regular expressions), then use parameter expansion to remove the part of the file name that you do not want. What is left is the "pattern" that you want in the cat command. Look in the bash man page under "Parameter Expansion". The format you are looking for is "${parameter%word}", which removes the trailing characters matching word. In the example below you could create the cat command file in the first loop. I chose to take another step.

Here's one example:

Code:

#!/bin/bash
# Name: create-combine-rar-script.sh 
# Description:  Create "concatenate a lot of different files" script.
#
# Note: This is an example for educational purposes. Please modify and use as
#       needed. Tested on bash 4.0.
#
# Resources: http://mywiki.wooledge.org/BashGuide, http://linuxcommand.org/
#
#                         fName
#                  |.................|
# File Pattern is: B74_ABC_001.rar.003
#                  |.........||......|
#                      pat     unpat
#                   patterns
# 
declare -r unpat='.rar.[0-9][0-9][0-9]'
declare -r outFile="./concatenate-commands.sh"
declare -r newFileSuffix='.rar.new'
declare -A patterns=()
declare -- fName='' pat=''
declare -i cnt=1
set -u

# Create an array of file patterns.
# Let bash associative array take care of uniqueness.
#
while IFS= read -r fName; do
  # get just the pattern from file name (${unpat} is left unquoted so it acts as a glob)
  pat="${fName%${unpat}}"

  # check to see if pattern has been added yet
  if [[ -z "${patterns["${pat}"]+set}" ]]; then
    patterns["${pat}"]=''
    # could build cat command here.
  fi
done < <(find ./ -maxdepth 1 -type f -name '*'"${unpat}" -printf '%f\n')

# Quick way to display/debug a variables content
# declare -p patterns

# Header of output file
#
cat <<_EOF_ >"${outFile}"
#!/bin/bash
#===============================================================================
# Name: ${outFile}
# Description: Concatenate a lot of different files.
# Created by: ${0##*/} on $($(type -P date) '+%a, %F %T')
# Note: This script is intended to be run once unless output files are removed.
#===============================================================================
_EOF_

# For each pattern, create command
#
for pat in "${!patterns[@]}"; do
  printf -- '# %d - Pattern: %s\n' $((cnt++)) "${pat}"
  printf -- '%s %s%s >> %s%s\n' \
    "$(type -P cat)" "${pat}" "${unpat}" "${pat}" "${newFileSuffix}"
  printf -- '\n'
done >> "${outFile}"

chmod u+x "${outFile}"
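
To see the two key ideas in isolation (suffix removal with "${parameter%word}" and the associative array used for uniqueness), a stripped-down demo might look like this. The file names are just the examples from this thread typed in as strings, nothing is read from disk, and the variable names `unpat` and `bases` are only for illustration.

```shell
#!/bin/bash
# Glob pattern for the piece suffix: a dot followed by three digits.
unpat='.[0-9][0-9][0-9]'
declare -A bases=()

for fName in B74_ABC_001.rar.001 B74_ABC_001.rar.002 MX3_EFG_020.rar.001; do
    # Remove the shortest trailing match of the pattern. $unpat is left unquoted
    # inside the braces so that [0-9] is treated as a glob, not as literal text.
    base=${fName%$unpat}
    # Keys of an associative array are unique, so repeated base names collapse.
    bases["$base"]=1
done

# Key order in an associative array is unspecified, hence the sort.
printf '%s\n' "${!bases[@]}" | sort
```

This prints B74_ABC_001.rar and MX3_EFG_020.rar, one per line: three piece names reduced to two base names.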