====== Basics ======

Move or rename files:
  mv file destination

Copy files:
  cp file destination

List files in a directory:
  ls
  ls -ltrah

Look at what processes are running on the server (for CPU and RAM usage):
  top

Look at what processes are doing input/output on the hard disks:
  iotop

See what processes mmkeller is running:
  ps -u mmkeller

Change/modify permissions; in this case, add read & write permissions for the group:
  chmod g+rw directory

Change owner of directory to newguy:
  chown newguy directory

Change group of directory to km:
  chgrp km directory

====== Look at huge files: ======

After opening the file with the command below, type -S (chop off long lines) and -N (put line numbers on lines):
  less file

Many times you want aligned columns so the file is easy to read. Here's how:
  column -t file | less -SN

Finding particular files ("*" means anything in between these words):
  ls -l *has*these*words*lastword | grep -v notthiswordthough

Or get just the last 500 lines and 'pipe' them to less:
  tail -n500 file | less

Count number of rows in a file:
  wc -l file

Count number of columns in a file, first line only:
  awk -F ' ' '{print NF; exit}' file

Largest number of columns on any line in the file:
  awk '{ if (NF > max) max = NF } END { print max }' file

Webpage on looking at large files in unix: [[http://www.thegeekstuff.com/2009/08/10-awesome-examples-for-viewing-huge-log-files-in-unix/|Look at Large Files]]

====== Basic file manipulation: ======

Remove the first 6 columns of a space-delimited file:
  cut -f 7- -d ' ' infile > outfile

Keep columns 400 through 897 of a file separated by commas:
  cut -f 400-897 -d ',' infile > outfile

Keep only columns 1, 2, 3, 7, 8, & 10 of a file (cut assumes tab-delimited columns unless -d is given):
  cut -f 1,2,3,7,8,10 file.ped > checkfile.ped

Keep all rows starting with the characters "UCL":
  grep '^UCL' input > output

Keep all rows that match any of the patterns listed (one per line) in the file called 'IID-list':
  grep -f IID-list input > output

Copy the top line of file1 and put that line at the top of file2:
  head -n1 file1 > header
  cat header file2 > file3

Change all 1's to 2's and 0's to 1's (important if wanting to use 0/1 allelic codes in PLINK). Note that tr changes every 1 and 0 in the file, including any in ID columns:
  tr 1 2 < file1 > file2
  tr 0 1 < file2 > file3

An alternative way to substitute a value using sed (like find and replace applied to all items in a file):
  sed 's/string/cheese/g' < infile > outfile
(Note: use a different file name for the outfile, or else sed will return an empty file.)

Using awk and perl to split a column into multiple columns. awk grabs the 6th column, and perl switches the / character into tab delimiters:
  awk '{print $6}' file | perl -pe 's{/}{\t}g' > newfile

The same command, but without including the column name (skips the first row):
  awk 'NR>1 {print $6}' file | perl -pe 's{/}{\t}g' > newfile

Using perl to change a column copied from .xls into a flat file column (turns carriage returns into newlines):
  perl -pe 's/\x0D/\n/g' copied_xls_column.txt > newfile.txt

Merge two files using a common ID column (first column). Both files must be sorted on that column:
  join file1 file2 > file3

Remove the first row of a flat file:
  sed '1d' filename > filename2

Alternatively, for a file with 500 rows:
  tail -n499 filename > filename2

Or, for a file of any length:
  tail -n +2 filename > filename2

Links for file manipulation in UNIX:

For working with large files, see this page: [[http://compute.cnr.berkeley.edu/cgi-bin/man-cgi?largefile+5|Large Files in Unix]]

And for text processing commands, see this page: [[http://tldp.org/LDP/abs/html/textproc.html|Text Processing in Unix]]

====== Look at folder sizes: ======

  du -h --max-depth 1

If you want to sort the above (largest first):
  du --max-depth 1 /home/ | sort -nr

====== Remove a lot of files in different folders ======

This will remove the file ss3.out that exists in 100s of folders in the current directory:
  find ./ -name 'ss3.out' -exec rm {} \;

The words following the -exec option are the command that you want to execute, i.e. rm in this case. {} is a placeholder: each filename returned by the search is substituted there. \; is the terminating string, and is required at the end of the command.

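A minimal sketch of the look-first-then-delete pattern with find -exec, run inside a throwaway directory so nothing real is touched. The scratch directory, the run1/run2/run3 folder names, and keep.txt are made up for illustration:

```shell
#!/bin/sh
# Hypothetical scratch area created just for this demo.
scratch=$(mktemp -d)
mkdir -p "$scratch/run1" "$scratch/run2" "$scratch/run3"
for d in run1 run2 run3; do
    touch "$scratch/$d/ss3.out" "$scratch/$d/keep.txt"
done

# Dry run: only list the files that would be removed.
find "$scratch" -name 'ss3.out' -print

# Real run: {} is replaced by each filename found, \; ends the rm command.
find "$scratch" -name 'ss3.out' -exec rm {} \;

# Check the result: ss3.out should be gone, keep.txt untouched.
remaining=$(find "$scratch" -name 'ss3.out' | wc -l | tr -d ' ')
kept=$(find "$scratch" -name 'keep.txt' | wc -l | tr -d ' ')
echo "ss3.out left: $remaining, keep.txt left: $kept"

# Clean up the demo directory.
rm -r "$scratch"
```

Quoting the name pattern keeps the shell from expanding it before find sees it, which matters as soon as the pattern contains a "*".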
Remove files starting with "mm" that exist in 100s of folders in the current directory (quote the pattern so the shell doesn't expand it before find sees it):
  find ./ -name 'mm*' -exec rm {} \;

Before doing the above, you might look to see if it is going to remove the files you want!
  find ./ -name 'mm*' > look

====== List all files modified in the current directory (./) in last 9 days: ======

  find ./ -mtime -9 -exec ls -lt {} \;

Same thing, but only R scripts modified in last 9 days:
  find ./ -mtime -9 -name "*.R" -exec ls -lt {} \;

====== Rename multiple files in a folder ======

Use the rename function. A perl expression must come first. E.g., to change all .txt files to .bak:
  rename 's/\.txt$/.bak/' *.txt

To rename all files beginning with "NEW." so that they begin with "OLD.":
  rename 's/^NEW\./OLD./' NEW.*

====== Copy a file over the network (see also rsync) ======

  scp files.to.copy user@server.colorado.edu:~/myfolder/subfolder

====== Copy entire directory over the network (see also rsync) ======

  scp -r directory.to.copy user@server.colorado.edu:~/myfolder/subfolder

====== Change default permissions for the group ======

Add this line to your ~/.bashrc file:
  umask 002 #let group have read/write permissions on new files (and execute on new directories)

====== Compressing files ======

A single file, converting the file to a compressed file:
  gzip filename.ext

A single file, leaving the original file unchanged:
  gzip -c filename.ext > new.zipped.file.gz

Multiple files (when unzipped, the contents come back joined together as one file):
  gzip -c filename.ext filename2.ext > new.zipped.file.gz

Extract a gzipped file:
  gunzip new.zipped.file.gz

A directory (the -z option gzips the archive; leave it out to get a plain .tar):
  tar -cvzf tar.file.name.tar.gz directory

Extract a tar'd directory:
  tar -xvf tar.file.name.tar

Extract a tar.gz directory:
  tar -zxvf tar.file.name.tar.gz

Extract MULTIPLE tar.gz directories:
  for i in *.tar.gz; do tar -zxvf "$i"; done

====== Running multiple jobs at once in a shell script ======

  ./S800-loop1.sh &
  ./S800-loop2.sh &
  ./S800-loop3.sh &

====== Pausing for a given amount of time between starting processes in a shell script ======

  ./S800-loop1.sh &
  sleep 45m
  ./S800-loop2.sh &
  sleep 45m
  ./S800-loop3.sh &
  sleep 45m

====== Look at all files that have changed in last xx days ======

Replace xx with the number of days:
  find /directory -type f -ctime -xx | more

====== Look at multiple files at same time ======

  less file1 file2 file3

Inside less:
  :n   forward to next file
  :p   backward to previous file

====== Refresh .bashrc or .bash_profile files ======

After you have modified one of the files above, your shell needs to reload it before the new .bashrc or .bash_profile settings take effect. You can either restart the computer, log out and log back in, or do this:
  source ~/.bash_profile #assuming that you've modified .bash_profile

====== Download files using command line ======

  lwp-download http://pngu.mgh.harvard.edu/~purcell/plink/dist/plink-1.07-x86_64.zip

OR
  wget http://pngu.mgh.harvard.edu/~purcell/plink/dist/plink-1.07-x86_64.zip

====== Copy (or move) files that have changed within the last 5 days ======

  find ./ -type f -mtime -5 -exec cp {} ~/new/path/folder \;

Make sure that the target folder isn't in the folder being searched; i.e., that ~/new/path/folder isn't in ./. Otherwise, you'll start trying to copy the contents of the folder itself back into the folder.

====== Get basic information about the computer or node that you are on ======

Full details on every processor:
  cat /proc/cpuinfo

Just the processor lines:
  cat /proc/cpuinfo | grep processor

Count the processors:
  cat /proc/cpuinfo | grep processor | wc -l
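
A small sketch for getting the processor count into a variable, e.g. to decide how many jobs to start at once. It assumes a Linux-style /proc/cpuinfo and falls back to getconf on systems that don't have one; the variable name ncpu is made up for illustration:

```shell
#!/bin/sh
# Count the "processor" lines in /proc/cpuinfo when the file is readable.
ncpu=0
if [ -r /proc/cpuinfo ]; then
    ncpu=$(grep -c '^processor' /proc/cpuinfo || true)
fi

# Fallback (assumption: getconf knows _NPROCESSORS_ONLN on this system).
if [ "$ncpu" -lt 1 ]; then
    ncpu=$(getconf _NPROCESSORS_ONLN)
fi

echo "logical processors: $ncpu"
```

grep -c counts matching lines directly, so the separate cat and wc steps aren't needed here.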