Basics

moves or renames files

mv   file destination       

copies files

cp    file destination         

lists files in directory

ls
  ls -ltrah                                         

look at what processes are running on server (for CPU and RAM usage)

top                                      

look at what processes are using input/output onto hard disks

iotop                                   

see what processes mmkeller is running

ps -u mmkeller                

change/modify permissions; in this case, add read & write permissions to the group

chmod g+rw  directory   

change owner of directory to newguy

chown newguy directory 

change group of directory to km

chgrp km directory           

Look at huge files:

After opening the file with the command below, type -S (chop long lines instead of wrapping them) and -N (show line numbers) inside less

less file

Many times you want tab delimited columns to easily read the file, here's how:

column -t file | less -SN

finding particular files:

ls -l *has*these*words*lastword | grep -v notthiswordthough   

“*” means anything in between these words

To look at just the last 500 lines of a huge file, 'pipe' them to less:

tail -n500 file | less

Count number of rows in a file:

wc -l file

Count number of columns in a file:

first line only:

awk -F ' ' '{print NF; exit}' file 

largest number of columns on any line in the file:

awk '{ if (NF > max) max = NF } END { print max }' file
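
As a quick check of the three counting commands above, here is a minimal sketch on a made-up file. The file name sample.txt and its contents are hypothetical, and everything is written to a temporary folder.

```shell
# Build a small whitespace-delimited sample file (hypothetical data).
tmp=$(mktemp -d)
printf 'id a b\n1 x y z\n2 p q\n' > "$tmp/sample.txt"

# Number of rows.
rows=$(wc -l < "$tmp/sample.txt")

# Number of columns on the first line only.
first_cols=$(awk '{print NF; exit}' "$tmp/sample.txt")

# Largest number of columns found on any line.
max_cols=$(awk 'NF > max { max = NF } END { print max }' "$tmp/sample.txt")

echo "rows=$rows first=$first_cols max=$max_cols"
rm -r "$tmp"
```

If the first-line count and the maximum differ, as they do here, some rows are ragged and worth inspecting before using cut.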

Webpage on looking at large files in Unix: Look at Large Files

Basic file manipulation:

Remove the first 6 columns of a file:

cut -f 7- -d ' ' infile > outfile

Keep columns 400 through 897 of a file separated by commas:

cut -f 400-897 -d ',' infile > outfile

Keep only columns 1, 2, 3, 7, 8, & 10 of a file:

cut -f 1,2,3,7,8,10 file.ped > checkfile.ped
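
The cut commands above can be tried on a tiny file first. This is only a sketch with hypothetical data; note that without -d, cut assumes the columns are separated by tabs.

```shell
tmp=$(mktemp -d)
printf 'a,b,c,d,e\n1,2,3,4,5\n' > "$tmp/infile"

# Drop the first 2 columns: "3-" means column 3 through the last one.
dropped=$(cut -d ',' -f 3- "$tmp/infile")

# Keep only columns 1, 2 and 5.
picked=$(cut -d ',' -f 1,2,5 "$tmp/infile")

printf '%s\n---\n%s\n' "$dropped" "$picked"
rm -r "$tmp"
```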

Keep all rows starting with the characters “UCL”:

grep '^UCL' input > output

Keep all rows that match any of the patterns (one per line) in the file called 'IID-list':

grep -f IID-list input > output
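
One thing to watch with grep -f is partial matches: an ID such as UCL1 also matches UCL10. The sketch below, on hypothetical IDs, compares the plain command with a stricter variant using -F (fixed strings) and -w (whole words only).

```shell
tmp=$(mktemp -d)
printf 'UCL1\nUCL7\n' > "$tmp/IID-list"
printf 'UCL1 0 1\nUCL10 1 1\nUCL7 0 0\nKCL1 1 0\n' > "$tmp/input"

# Plain -f treats each line of IID-list as a pattern, so UCL1 also matches UCL10.
loose=$(grep -f "$tmp/IID-list" "$tmp/input" | wc -l)

# -F: fixed strings rather than regular expressions; -w: whole words only.
strict=$(grep -w -F -f "$tmp/IID-list" "$tmp/input" | wc -l)

echo "loose=$loose strict=$strict"
rm -r "$tmp"
```

Whole-word matching still looks at every column, so an ID that also appears as a value elsewhere on a line would be kept too.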

Copy the top line of file1 and add that line to the top of file2 (the result is written to file3):

head -n1 file1 > header
cat header file2 > file3

change all 1's to 2's and 0's to 1's (important if wanting to use 0/1 allelic codes in PLINK):

tr 1 2 < file1 > file2
tr 0 1 < file2 > file3
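
The order of the two steps matters: if 0's were changed to 1's first, the second step would turn everything into 2's. As a sketch on hypothetical data, the same recoding can be done in a single pass, which avoids the ordering problem. Keep in mind that tr changes every 0 and 1 character in the file, including any inside ID columns.

```shell
tmp=$(mktemp -d)
printf '0 1 1 0\n' > "$tmp/file1"

# One pass: 0 -> 1 and 1 -> 2 at the same time, so order cannot go wrong.
tr 01 12 < "$tmp/file1" > "$tmp/file3"
recoded=$(cat "$tmp/file3")

# Wrong order, for comparison: 0 -> 1 first, then every 1 (old and new) -> 2.
wrong=$(tr 0 1 < "$tmp/file1" | tr 1 2)

echo "recoded: $recoded"
echo "wrong:   $wrong"
rm -r "$tmp"
```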

An alternative way to substitute a value using sed (like find and replace to all items in a file):

sed 's/string/cheese/g' < infile > outfile 

(Note, use a different file name for the outfile or else sed will return an empty file)
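
If the goal is to end up with the edited text under the original file name, one common pattern is to write to a temporary name and rename it afterwards. A minimal sketch, with hypothetical file contents:

```shell
tmp=$(mktemp -d)
printf 'string one\nno match\nstring string\n' > "$tmp/infile"

# Redirecting to the same name would truncate the file before sed reads it,
# so write to a temporary name and rename it only if sed succeeded.
sed 's/string/cheese/g' "$tmp/infile" > "$tmp/infile.tmp" && mv "$tmp/infile.tmp" "$tmp/infile"

result=$(cat "$tmp/infile")
echo "$result"
rm -r "$tmp"
```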

Using awk and perl to split a column into multiple columns:

awk '{print $6}' file | perl -pe 's{/}{\t}g' > newfile

awk grabs the 6th column, and perl converts each / character into a tab delimiter

awk 'NR>1 {print $6}' file | perl -pe 's{/}{\t}g' > newfile

the same command, but skipping the first (header) row so the column name is not included
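
The same split can also be done in awk alone, which may be handy where perl is not available. This is a sketch under the assumption that column 6 holds values like A/G; the data below is hypothetical.

```shell
tmp=$(mktemp -d)
printf 'c1 c2 c3 c4 c5 geno\na b c d e A/G\na b c d e C/T\n' > "$tmp/file"

# NR > 1 skips the header row; split() breaks column 6 on "/" into parts[1], parts[2].
awk 'NR > 1 { split($6, parts, "/"); print parts[1] "\t" parts[2] }' "$tmp/file" > "$tmp/newfile"

split_out=$(cat "$tmp/newfile")
echo "$split_out"
rm -r "$tmp"
```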

Using perl to convert the carriage returns (\x0D) in a column copied from an .xls file into newlines, giving a flat file column:

perl -pe 's/\x0D/\n/g' copied_xls_column.txt > newfile.txt

merge two files using a common ID column (first column); both files must already be sorted on that column:

join file1 file2 > file3
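
join only pairs up lines correctly when both inputs are sorted on the join column, so it is usually safest to sort first. A minimal sketch with hypothetical IDs:

```shell
tmp=$(mktemp -d)
printf 'id3 c\nid1 a\nid2 b\n' > "$tmp/file1"
printf 'id2 20\nid3 30\nid1 10\n' > "$tmp/file2"

# join expects both inputs sorted on the join field (column 1 here).
sort -k1,1 "$tmp/file1" > "$tmp/file1.sorted"
sort -k1,1 "$tmp/file2" > "$tmp/file2.sorted"
join "$tmp/file1.sorted" "$tmp/file2.sorted" > "$tmp/file3"

merged=$(cat "$tmp/file3")
echo "$merged"
rm -r "$tmp"
```

By default join drops IDs that appear in only one of the two files.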

Remove first row of a flat file:

sed '1d' filename > filename2

Alternatively, for a file with 500 rows:

tail -n499 filename > filename2 
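
When the number of rows is not known in advance, tail can instead be told which line to start from. A small sketch on a hypothetical three-line file:

```shell
tmp=$(mktemp -d)
printf 'header\nrow1\nrow2\n' > "$tmp/filename"

# -n +2 means "start output at line 2", so only the first line is dropped.
tail -n +2 "$tmp/filename" > "$tmp/filename2"

body=$(cat "$tmp/filename2")
echo "$body"
rm -r "$tmp"
```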

Links for file manipulation in UNIX: For working with large files, see this page: Large Files in Unix

And for text processing commands, see this page: Text Processing in Unix

Look at folder sizes:

du -h --max-depth 1

if you want to sort the above

du --max-depth 1 /home/ | sort -nr

Remove a lot of files in different folders

this will remove file ss3.out that exists in 100s of folders in the current directory

find ./ -name 'ss3.out' -exec rm {} \;

The words following the -exec option are the command that you want to execute, i.e. rm in this case.

{} is a placeholder: each filename returned by the search is substituted there.

\; is the terminating string, and is required at the end of the command.

Remove files starting with “mm” that exist in 100s of folders in the current directory

find ./ -name 'mm*' -exec rm {} \;

Before doing the above, you might look to see if it is going to remove the files you want!

find ./ -name 'mm*' > look
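
The preview-then-delete routine can be rehearsed safely in a scratch folder. The sketch below uses hypothetical folder and file names; the pattern is quoted so that find, not the shell, expands it, and -type f restricts the match to regular files.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/run1" "$tmp/run2"
touch "$tmp/run1/mm_a.out" "$tmp/run2/mm_b.out" "$tmp/run2/keep.out"

# Step 1: preview. Quoting the pattern stops the shell from expanding it first.
preview=$(find "$tmp" -type f -name 'mm*' | wc -l)

# Step 2: the same search, now removing each match.
find "$tmp" -type f -name 'mm*' -exec rm {} \;

left=$(find "$tmp" -type f | wc -l)
echo "matched=$preview remaining=$left"
rm -r "$tmp"
```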

List all files modified in the current directory (./) in last 9 days:

find ./ -mtime -9 -exec ls -lt {} \;

Same thing, but only R scripts modified in last 9 days:

find ./ -mtime -9 -name "*.R" -exec ls -lt {} \;
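
To see how the time and name tests combine, here is a sketch that backdates one hypothetical script and then searches. It uses the plain numeric form -mtime -9 (less than 9 days old) and a "*.R" pattern so that any file name ending in .R matches.

```shell
tmp=$(mktemp -d)
touch "$tmp/new.R" "$tmp/new.txt"

# Backdate one script to 1 January 2000 so it falls outside the 9-day window.
touch -t 200001010000 "$tmp/old.R"

# Only regular files, modified less than 9 days ago, whose names end in .R.
recent_r=$(find "$tmp" -type f -mtime -9 -name '*.R')

echo "$recent_r"
rm -r "$tmp"
```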

Rename multiple files in a folder

Use the rename command (the Perl version, which takes a Perl expression as its first argument). E.g., to change all .txt files to .bak:

rename 's/\.txt$/.bak/' *.txt

To rename all files beginning with “NEW.” so that they begin with “OLD.” instead:

rename 's/^NEW\./OLD./' NEW.*
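
Not every system ships the Perl version of rename (some have a different rename with another syntax), so a plain shell loop is a portable fallback. A minimal sketch on hypothetical files:

```shell
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/notes.md"

# ${f%.txt} strips a trailing ".txt" from the name; ".bak" is then appended.
for f in "$tmp"/*.txt; do
    mv "$f" "${f%.txt}.bak"
done

renamed=$(ls "$tmp")
echo "$renamed"
rm -r "$tmp"
```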

Copy a file over the network (see also rsync)

scp files.to.copy user@server.colorado.edu:~/myfolder/subfolder

Copy entire directory over the network (see also rsync)

scp -r directory.to.copy user@server.colorado.edu:~/myfolder/subfolder

Change default permissions for the group

Add this line to your ~/.bashrc file:

umask 002 #new files get group read/write; new directories get group read/write/execute
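
The effect of the umask can be checked without touching ~/.bashrc. This sketch sets it inside a subshell, creates a hypothetical file and folder, and prints their permission strings.

```shell
tmp=$(mktemp -d)

# Run in a subshell so the umask change does not leak into the calling shell.
perms=$(
    umask 002
    touch "$tmp/newfile"
    mkdir "$tmp/newdir"
    # Keep only the 10-character permission string of each entry.
    ls -ld "$tmp/newdir" "$tmp/newfile" | cut -c1-10
)

echo "$perms"
rm -r "$tmp"
```

The folder should show rwx for the group and the file rw-, since new files are not created executable.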

Compressing files

A single file, converting the file to a compressed file

gzip filename.ext

A single file, leaving the original file unchanged

gzip -c filename.ext > new.zipped.file.gz

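Multiple files (they are compressed into one stream, so unzipping gives a single file with the contents joined end to end)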

gzip -c filename.ext filename2.ext > new.zipped.file.gz 

Extract a gzipped file

gunzip new.zipped.file.gz  

A directory

tar -cvzf tar.file.name.tar.gz directory

Extract a tar'd directory

tar -xvf tar.file.name.tar

Extract a tar.gz directory

tar -zxvf tar.file.name.tar.gz

Extract MULTIPLE tar.gz directories

for i in *.tar.gz; do tar -zxvf "$i"; done
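
A round trip on a throwaway folder is a cheap way to confirm an archive before deleting the original. The sketch below uses hypothetical names and adds two options not shown above: -t to list an archive without extracting it, and -C to choose the folder tar works in.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/directory"
echo 'hello' > "$tmp/directory/a.txt"

# -C changes into $tmp first, so the archive stores the relative path "directory/...".
tar -czf "$tmp/archive.tar.gz" -C "$tmp" directory

# List the contents without extracting (-t), then extract into a separate folder.
listing=$(tar -tzf "$tmp/archive.tar.gz")
mkdir "$tmp/restore"
tar -xzf "$tmp/archive.tar.gz" -C "$tmp/restore"

restored=$(cat "$tmp/restore/directory/a.txt")
echo "$listing"
rm -r "$tmp"
```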

Running multiple jobs at once in a shell script

 ./S800-loop1.sh &
 ./S800-loop2.sh &
 ./S800-loop3.sh &

Pausing for a given amount of time between starting processes in a shell script

 ./S800-loop1.sh &
 sleep 45m
 ./S800-loop2.sh &
 sleep 45m
 ./S800-loop3.sh &
 sleep 45m
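
If the script has to do something after all the background jobs are done (for example combine their output), the shell builtin wait can be used. In this sketch the function job is a hypothetical stand-in for scripts like ./S800-loop1.sh.

```shell
# Hypothetical stand-in for the real job scripts: sleeps briefly, then reports.
job() { sleep 1; echo "job $1 done"; }

tmp=$(mktemp -d)
job 1 > "$tmp/1.log" &
job 2 > "$tmp/2.log" &
job 3 > "$tmp/3.log" &

# wait blocks until every background job started by this shell has finished.
wait

finished=$(cat "$tmp"/*.log | wc -l)
echo "finished jobs: $finished"
rm -r "$tmp"
```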

Look at all files that have changed in last xx days

find /directory -type f -ctime -xx | more 

Look at multiple files at same time

less file1 file2 file3 
  :n forward to next file 
  :p backward to previous file 

Refresh .bashrc or .bash_profile files

After you have modified one of the files above, you need to reload it so that your shell uses the new .bashrc or .bash_profile settings. You can either restart the computer, log out and log back in, or do this:

source ~/.bash_profile #assuming that you've modified .bash_profile 

Download files using command line

lwp-download http://pngu.mgh.harvard.edu/~purcell/plink/dist/plink-1.07-x86_64.zip 

OR

wget http://pngu.mgh.harvard.edu/~purcell/plink/dist/plink-1.07-x86_64.zip    

Copy (or move) files that have changed within the last 5 days

find ./ -mtime -5 -exec cp {} ~/new/path/folder \;

Make sure that the target folder isn't in the folder being found; i.e., that ~/new/path/folder isn't in ./. Otherwise, you'll start trying to copy the contents of the folder itself back into the folder.
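
Here is a sketch of the same idea in a scratch area, with the destination kept outside the folder being searched. All names are hypothetical; -type f is added so that only regular files, not folders, are handed to cp.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/src/sub" "$tmp/dest"
touch "$tmp/src/a.txt" "$tmp/src/sub/b.txt"

# Backdate one file so it falls outside the 5-day window.
touch -t 200001010000 "$tmp/src/old.txt"

# Copy recently modified files; matches from subfolders land flat in dest.
find "$tmp/src" -type f -mtime -5 -exec cp {} "$tmp/dest" \;

copied=$(ls "$tmp/dest")
echo "$copied"
rm -r "$tmp"
```

Because the copies are flattened into one folder, two files with the same name in different subfolders would overwrite each other.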

Get basic information about the computer or node that you are on

cat /proc/cpuinfo
cat /proc/cpuinfo | grep processor
cat /proc/cpuinfo | grep processor | wc -l
keller_and_evans_lab/unix_basics.txt · Last modified: 2019/10/28 18:20 by lessem