Detailed Docker container common operations

First, start the container

There are two ways to start a container: create a new container from an image and start it, or restart a container that is in the terminated state.
Because Docker containers are so lightweight, users often delete and recreate them at any time.

New and start

For example, the following command prints “Hello world” and then terminates the container.

$ docker run ubuntu:14.04 /bin/echo 'Hello world'
Hello world

This is almost indistinguishable from running /bin/echo 'Hello world' directly on the host.

The following command launches a bash terminal that allows the user to interact.

$ docker run -t -i ubuntu:14.04 /bin/bash 
root@af8bae53bdd3:/#

The -t option makes Docker allocate a pseudo-terminal (pseudo-tty) bound to the container’s standard input, and -i keeps the container’s standard input open.

When docker run creates a container, the standard operations that Docker performs in the background are:

  • Check whether the specified image exists locally. If it does not, download it from the public repository.
  • Create and start a container from the image
  • Allocate a file system and mount a read-write layer on top of the read-only image layers
  • Bridge a virtual interface into the container from the bridge interface configured on the host
  • Assign the container an IP address from the address pool
  • Execute the user-specified application
  • Terminate the container after execution finishes

Starting a terminated container 
You can use the docker container start command to start a container that has been terminated.
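
The two ways of starting a container can be combined in a small helper. The sketch below is only an illustration: the function name start_or_run and the container name in the usage note are hypothetical, and it assumes a Docker CLI that provides docker container inspect.

```shell
# Hypothetical helper: start the named container if it already exists,
# otherwise create and start a new one from the given image.
start_or_run() {
  local name=$1 image=$2
  shift 2
  if docker container inspect "$name" >/dev/null 2>&1; then
    # The container exists (possibly in the terminated state): start it again.
    docker container start "$name"
  else
    # No such container yet: create a new one and run it in the background.
    docker run -d --name "$name" "$image" "$@"
  fi
}
```

It could be called as start_or_run demo ubuntu:14.04 /bin/sh -c "while true; do echo hello world; sleep 1; done", where demo is an arbitrary container name.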

Second, run in the background (daemon mode)

More often you want the container to run in the background instead of printing the output of its command to the current terminal. Add the -d parameter to do this.

$ docker run -d ubuntu /bin/sh -c "while true; do echo hello world; sleep 1; done"
cb30b87566d0550ec5f1232d148c5ffed6546c347889e58a6405579f2af73f2a

When a container is started with -d, docker run returns its unique ID. The output can then be viewed with docker logs [container ID or NAMES]. Without the -d parameter, the output (STDOUT) is printed directly on the host terminal.

View container information with the docker container ls command.

$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cb30b87566d0 ubuntu "/bin/sh -c 'while t…" 2 minutes ago Up 2 minutes goofy_mcclintock

To get the output of the container, you can use the docker container logs command.

$ docker container logs goofy_mcclintock 
hello world 
hello world 
hello world 
……

Note: How long the container keeps running depends on the command given to docker run, not on the -d parameter.

Third, terminate the container

You can use docker container stop to terminate a running container. The format is:
docker container stop [options] CONTAINER [CONTAINER…]

In addition, a container terminates automatically when the application specified for it exits.
For example, if a container runs only a terminal, it terminates as soon as the user leaves that terminal with the exit command or Ctrl+d.

Containers in the terminated state can be listed with the docker container ls -a command. For example:

$ docker container stop goofy_mcclintock 
goofy_mcclintock

$ docker container ls -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
cb30b87566d0 ubuntu "/bin/sh -c 'while t…" 20 minutes ago Exited (137) 23 seconds ago goofy_mcclintock

A terminated container can be started again with the docker container start command; the docker container restart command stops a running container and then starts it again.

Fourth, enter the container

When the -d parameter is used, the container will enter the background after it starts. 
Use the docker attach command or the docker exec command to enter the container. It is recommended to use the docker exec command for reasons explained below.

The attach command 
docker attach is a command that comes with Docker. The following example shows how to use this command.

$ docker run -dit ubuntu 
e1ffd4f792fe0ce7f7e700147051e1f792e352f5b70929eb9376393ac20114b4

$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e1ffd4f792fe ubuntu "/bin/bash" About a minute ago Up About a minute awesome_payne

$ docker attach e1ff
root@e1ffd4f792fe:/#

Note: If you exit from this attached stdin, the container stops.

The exec command
The -i and -t parameters
docker exec accepts several parameters; the main ones here are -i and -t.
With -i alone, no pseudo-terminal is allocated, so you do not get the familiar Linux command prompt, but the output of each command is still returned.
With -i and -t together, you get the familiar Linux command prompt.

$ docker run -dit ubuntu 
16168d4b66b115b5afac5836db3ff93304774e98489f628ac625fff2bcd640ba

$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
16168d4b66b1 ubuntu "/bin/bash" 58 seconds ago Up 57 seconds happy_bardeen

$ docker exec -it 16168 bash
root@16168d4b66b1:/#

Exiting from this session does not stop the container, which is why docker exec is recommended.
For more parameters, see docker exec --help.

Fifth, delete the container

Delete a container that is in a terminated state in the format: 
docker container rm [options] CONTAINER [CONTAINER…]

$ docker container rm awesome_payne 
awesome_payne

If you want to delete a running container, you can add the -f parameter. Docker will send a SIGKILL signal to the container.

Clean up all containers in the terminated state. The docker container ls -a command shows every container that has been created, including terminated ones. If there are many of them, deleting them one by one is tedious. The following command removes all terminated containers at once.

$ docker container prune
WARNING! This will remove all stopped containers.
Are you sure you want to continue? [y/N] y
Deleted Containers: 
545f8f6d19286efae28307d06ed1acc034d07f109e907c01892471a6f89e772d 
cb30b87566d0550ec5f1232d148c5ffed6546c347889e58a6405579f2af73f2a 
……

Export and import containers

Exporting a container 
If you want to export a local container, you can use the docker export command.

$ docker container ls -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
16168d4b66b1 ubuntu "/bin/bash" 18 minutes ago Up 18 minutes happy_bardeen

$ docker export 16168d4b66b1 > ubuntu.tar

This will export the container snapshot to a local file.

Importing a container snapshot
A container snapshot file can be imported as an image with docker import, for example:

$ cat ubuntu.tar | docker import - test/ubuntu:v1.0
sha256:91b174fec9ed55d7ebc3d2556499713705f40713458e8594efa114f261d7369a

$ docker image ls 
REPOSITORY TAG IMAGE ID CREATED SIZE 
test/ubuntu v1.0 91b174fec9ed 10 seconds ago 69.8MB 
ubuntu latest 735f80812f90 3 weeks ago 83.5MB

Alternatively, you can import by specifying a URL or a directory, such as:
$ docker import http://example.com/exampleimage.tgz example/imagerepo

Note: You can use either docker load to load an image storage file into the local image library, or docker import to import a container snapshot into it. The difference is that a container snapshot file discards all history and metadata (it keeps only the state of the container at the time of the snapshot), while an image storage file keeps the complete record and is therefore larger. In addition, metadata such as tags can be reassigned when importing from a container snapshot file.

EC2 volume size incorrect

Expanding the Storage Space of an EBS Volume on Linux

[root@data ~]# df -TH
Filesystem Type Size Used Avail Use% Mounted on
/dev/xvda1 xfs 859G 818G 42G 96% /
devtmpfs devtmpfs 4.1G 0 4.1G 0% /dev
tmpfs tmpfs 4.2G 0 4.2G 0% /dev/shm
tmpfs tmpfs 4.2G 26M 4.1G 1% /run
tmpfs tmpfs 4.2G 0 4.2G 0% /sys/fs/cgroup
/dev/xvdg ext4 212G 201G 4.1k 100% /databackup
/dev/mapper/vg_newlvm-centos7_newvol ext4 106G 20G 81G 20% /data1
tmpfs tmpfs 821M 0 821M 0% /run/user/1004
tmpfs tmpfs 821M 0 821M 0% /run/user/0

[root@data ~]# resize2fs /dev/xvdg
resize2fs 1.42.9 (28-Dec-2013)
Filesystem at /dev/xvdg is mounted on /databackup; on-line resizing required
old_desc_blocks = 25, new_desc_blocks = 50
The filesystem on /dev/xvdg is now 104857600 blocks long.

[root@data ~]# df -TH
Filesystem Type Size Used Avail Use% Mounted on
/dev/xvda1 xfs 859G 818G 42G 96% /
devtmpfs devtmpfs 4.1G 0 4.1G 0% /dev
tmpfs tmpfs 4.2G 0 4.2G 0% /dev/shm
tmpfs tmpfs 4.2G 26M 4.1G 1% /run
tmpfs tmpfs 4.2G 0 4.2G 0% /sys/fs/cgroup
/dev/xvdg ext4 423G 201G 203G 50% /databackup
/dev/mapper/vg_newlvm-centos7_newvol ext4 106G 20G 81G 20% /data1
tmpfs tmpfs 821M 0 821M 0% /run/user/1004
tmpfs tmpfs 821M 0 821M 0% /run/user/0
[root@data ~]#
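
As a quick sanity check, the block count reported by resize2fs can be converted to a size and compared with the df output. This is a sketch that assumes a 4096-byte filesystem block size (a common ext4 default, but not shown in the output above); the variable names are illustrative.

```shell
# Block count taken from the resize2fs output above.
blocks=104857600
# Assumption: 4 KiB blocks (check with `tune2fs -l /dev/xvdg` on the real system).
block_size=4096

bytes=$((blocks * block_size))
gib=$((bytes / 1024 / 1024 / 1024))   # binary gigabytes
gb=$((bytes / 1000000000))            # decimal gigabytes, the unit used by df -H

echo "filesystem size: ${bytes} bytes = ${gib} GiB = about ${gb} GB"
```

Under that assumption the filesystem spans about 429 GB before overhead, which is in line with the 423G that df -TH reports for /dev/xvdg after the resize.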

rsync increment data

The command below copies only incremental data and keeps the remote server in sync with the source.

  1. It copies only new or changed data.
  2. It deletes data at the destination that has been deleted from the source.
  3. It copies data again from the source if it has been deleted at the destination.
  4. In short, it keeps both environments in sync.

rsync -avWe ssh --delete-before (source)/ root@(remote host):(destination)/
rsync -avW --delete-before -e ssh (source)/ root@(remote host):(destination)/

Example:

rsync -avWe ssh --delete-before /data/ root@192.168.1.4:/data/
rsync -avW --delete-before -e ssh /data/ root@192.168.1.4:/data/

The trailing slash on the source matters: /data/ copies the contents of /data into the destination, whereas /data without the slash would create /data/data on the remote side.


To delete files in the target, add the --delete option to your command. For example:

rsync -avh source/ dest/ --delete
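
To see this sync behaviour without touching a remote server, the following sketch runs rsync between two temporary local directories. The file names are made up for the demonstration, and it assumes rsync is installed.

```shell
set -eu
command -v rsync >/dev/null || { echo "rsync is not installed" >&2; exit 0; }

src=$(mktemp -d)
dst=$(mktemp -d)
echo one > "$src/keep.txt"
echo two > "$src/gone.txt"

# Initial copy: both files arrive at the destination.
rsync -a --delete "$src/" "$dst/"

rm "$src/gone.txt"   # deleted at the source
rm "$dst/keep.txt"   # deleted at the destination

# Second run: gone.txt is removed from the destination, keep.txt is copied again.
rsync -a --delete "$src/" "$dst/"

dst_listing=$(ls -A "$dst")
echo "destination now contains: $dst_listing"
```

Adding -n (--dry-run) to a real command first is a cheap way to preview what --delete would remove.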

Apache Configure CORS Headers for Whitelist Domains

In the current implementation of Cross-Origin Resource Sharing (CORS), the Access-Control-Allow-Origin header can carry only a single origin or a wildcard as its value. This is not optimal when multiple clients connect to the same virtual server and you simply want to allow a list of known client host domains.

Since only a single origin can be delivered back to the client, Apache must read the incoming Origin header and match it against the list of “white” (accepted) domains. If a match is found, Apache echoes the origin back to the client as the value of Access-Control-Allow-Origin.

Use the following configuration snippet in the Apache virtual host “.conf” file or in the server “.htaccess” file. Ensure that mod_headers and mod_setenvif (which provides SetEnvIfNoCase) are enabled.

<IfModule mod_headers.c>
   SetEnvIfNoCase Origin "https?://(www\.)?(domain\.com|staging\.domain\.com)(:\d+)?$" ACAO=$0
   Header set Access-Control-Allow-Origin %{ACAO}e env=ACAO
</IfModule>

The regular expression https?://(www\.)?(domain\.com|staging\.domain\.com)(:\d+)?$ is matched against the Origin header, which browsers send with cross-origin requests. The pattern matches both the http and https protocols, an optional www. subdomain, and then the actual host name of one of your whitelist entries. An optional port number may follow the host name; because of the trailing $, nothing else may. This example will therefore enable:

* http://domain.com
* https://domain.com
* http://www.domain.com
* https://www.domain.com
* http://staging.domain.com
* https://staging.domain.com
* http://www.staging.domain.com
* https://www.staging.domain.com

If you send a request from http://staging.domain.com/app/, the response would include the header:

Access-Control-Allow-Origin: http://staging.domain.com

If you sent another request from https://www.domain.com/client/, the response would include the header:

Access-Control-Allow-Origin: https://www.domain.com
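
Before deploying the snippet, the whitelist pattern can be tried against a few Origin values on the command line. The sketch below is an approximation rather than what Apache itself runs: it uses grep -E, so \d is written as [0-9], and it adds a leading ^ anchor that the Apache snippet does not have. The function name origin_allowed and the test origins are hypothetical.

```shell
# Whitelist pattern rewritten for POSIX extended regular expressions.
pattern='^https?://(www\.)?(domain\.com|staging\.domain\.com)(:[0-9]+)?$'

# Hypothetical helper: succeeds if the given Origin value is on the whitelist.
# -i mirrors the case-insensitive matching of SetEnvIfNoCase.
origin_allowed() {
  printf '%s\n' "$1" | grep -Eiq "$pattern"
}

for origin in \
  'https://www.domain.com' \
  'http://staging.domain.com:8080' \
  'https://domain.com.evil.example' \
  'https://other.com'
do
  if origin_allowed "$origin"; then
    echo "allow $origin"
  else
    echo "deny  $origin"
  fi
done
```

The first two origins are accepted; the last two are rejected, because the $ anchor stops a whitelisted name from being used as a prefix of a longer host name.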

Linux: include hidden files in tar archive

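When you create a tar archive of a directory tree using a shell wildcard (*), the hidden files at the top level are not included, because the shell does not expand * to names that start with a dot. Here’s how to include the hidden files.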

Say you have a web directory called “/var/www/html/mysite/” that contains the following tree:

.htaccess
index.php
logo.jpg
style.css
admin_dir/.htaccess
admin_dir/includes.php
admin_dir/index.php

Normally you would use the tar command like this:

tar czf mysite.tar.gz /var/www/html/mysite/*

This way all files are tarred except the top-level .htaccess file. (admin_dir/.htaccess is still included, because tar recurses into admin_dir itself.)

The solution is actually quite simple: replace the asterisk (*) by a dot (.):

tar czf mysite.tar.gz /var/www/html/mysite/.

This way the .htaccess files are included in the tarfile.
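
The difference is easy to reproduce in a scratch directory. This sketch builds a reduced copy of the example tree under a temporary directory (the paths are throwaway ones, not /var/www) and lists what each archive contains.

```shell
set -eu
work=$(mktemp -d)
mkdir -p "$work/mysite/admin_dir"
touch "$work/mysite/.htaccess" "$work/mysite/index.php" \
      "$work/mysite/admin_dir/.htaccess"

# Variant 1: shell wildcard. The shell never passes the top-level dot file to tar.
( cd "$work/mysite" && tar czf "$work/glob.tar.gz" * )

# Variant 2: archive the directory itself (".") so tar walks every entry.
tar czf "$work/dot.tar.gz" -C "$work/mysite" .

glob_list=$(tar tzf "$work/glob.tar.gz")
dot_list=$(tar tzf "$work/dot.tar.gz")

echo "--- archived with *:"
echo "$glob_list"
echo "--- archived with .:"
echo "$dot_list"
```

Only the second listing contains the top-level .htaccess; admin_dir/.htaccess appears in both.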

Reset your WordPress password via email

If you’ve forgotten your WordPress admin password, you can reset it via email from the WordPress dashboard login page following these steps:

  1. Go to your WordPress login page (example.com/wp-admin)
  2. Click on Lost your password? at the bottom
  3. Enter the Username or E-mail of your WordPress admin user, then click on Get New Password
  4. You should get an email with the subject [WordPress Site] Password Reset. The body of this email will contain a link below the text To reset your password, visit the following address, go ahead and click on that link.
  5. Type in your New password, confirm it, then click on Reset Password

 

WordPress makes it super easy to reset your password. You can simply go to the login screen and click on the ‘Lost your password’ link.

You can also reset a WordPress password from phpMyAdmin by editing the user's row directly:

Server: localhost » Database: wordpress » Table: wp_users

There you can change the user name and the user email address.

Adding SSL to Apache on EC2 with Amazon Linux

These notes assume you have Apache installed and working on EC2 with Amazon Linux, but it’s fairly similar for other versions of Linux.

Install OpenSSL and the Apache Connector

# for Apache 2.2
yum install openssl mod_ssl
# for Apache 2.4
yum install openssl mod24_ssl
# restart Apache
service httpd restart

# or install Apache 2.4 itself, its tools and mod_ssl in one step
yum install openssl mod24_ssl httpd24-tools httpd24

Test SSL

https://yourserver.com/

This will bring up the default certificate that was created during installation.

Generate Key

cd /etc/pki/tls/private
openssl genrsa -out domain-name.key 2048
chown root:root domain-name.key
chmod 600 domain-name.key

Generate Request

mkdir -p /ec2-user/domain-name/ssl
cd /ec2-user/domain-name/ssl
sudo openssl req -new -key /etc/pki/tls/private/domain-name.key -out domain-name.pem
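
The key and request steps can be tried out non-interactively in a scratch directory before touching /etc/pki. This is a sketch under assumptions: the subject /CN=example.test is a placeholder, the files live in a temporary directory, and -subj is used only to skip the interactive prompts.

```shell
set -eu
command -v openssl >/dev/null || { echo "openssl is not installed" >&2; exit 0; }

workdir=$(mktemp -d)
umask 077   # new files are readable by the owner only, like chmod 600

# Generate a 2048-bit RSA private key.
openssl genrsa -out "$workdir/domain-name.key" 2048 2>/dev/null

# Create the certificate signing request; -subj supplies the subject
# so openssl does not prompt for it (example.test is a placeholder).
openssl req -new -key "$workdir/domain-name.key" \
  -subj "/CN=example.test" -out "$workdir/domain-name.pem"

# Check the request's self-signature and show its subject
# before sending it to the certificate authority.
openssl req -in "$workdir/domain-name.pem" -noout -verify
csr_subject=$(openssl req -in "$workdir/domain-name.pem" -noout -subject)
echo "$csr_subject"
```

Checking the subject at this point catches a mistyped domain name before the certificate authority signs it.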

Once the request has been generated and sent to your certificate authority they will send you back two .crt files. One is the domain cert and one is the bundle cert. You can rename them to domain-name.crt and domain-name-bundle.crt.

# put the crt files on the server with the correct permissions
cp domain-name.crt /etc/pki/tls/certs/domain-name.crt
chown root:root /etc/pki/tls/certs/domain-name.crt
chmod 600 /etc/pki/tls/certs/domain-name.crt

cp domain-name-bundle.crt /etc/pki/tls/certs/domain-name-bundle.crt
chown root:root /etc/pki/tls/certs/domain-name-bundle.crt
chmod 600 /etc/pki/tls/certs/domain-name-bundle.crt

It’s important to change the permissions on the files, or Apache and OpenSSL will not work.

Configure Apache SSL

# back up the conf file
cp /etc/httpd/conf.d/ssl.conf /etc/httpd/conf.d/ssl.conf.bkp
# edit the file
nano /etc/httpd/conf.d/ssl.conf

# search for the .key file line below and replace localhost.key

SSLCertificateKeyFile /etc/pki/tls/private/domain-name.key

# search for the .crt file line below and replace localhost.crt

SSLCertificateFile /etc/pki/tls/certs/domain-name.crt

# search for the bundle.crt file line below and point it to the new bundle.crt

SSLCACertificateFile /etc/pki/tls/certs/domain-name-bundle.crt

# restart Apache
service httpd restart

This allows one SSL Domain on the server. If you want to have more than one SSL domain on the server it’s a bit more setup. I’ll cover that in a different post.

How to Improve rsync Performance

I need to transfer 10TB of data from one machine to another. Those 10TB of files live on a large RAID that spans 7 different disks. The target machine has another large RAID that spans 12 different disks. It is not easy to copy those files locally, so I decided to copy them over the LAN.

Four options come to mind: scp, rsync, rsyncd (rsync as a daemon) and netcat.

scp

scp is handy and easy to use, but it has two disadvantages: it is slow and it is not fault-tolerant. Because scp runs over SSH, all data is encrypted before it is transferred. The encryption slows down the overall transfer: it adds some overhead to the data and uses extra CPU. If the transfer is interrupted, there is no easy way to resume other than transferring everything again. Here are some example commands:

#Source machine
#Typical speed is about 20 to 30MB/s
scp -r /data target_machine:/data

#Or you can enable the compression on the fly
#Depending on the type of your data, if your data is already compressed, you may see no or negative speed improvement
scp -rC /data target_machine:/data

rsync

rsync is similar to scp. It encrypts the data (via SSH), so the data is safe. It can also transfer only the newer files, which reduces the amount of data being transferred. However, it comes with a few disadvantages: a long decision time, encryption (which adds overhead) and extra computation (e.g., data comparison, encryption and decryption). For example, if I use rsync to transfer 10TB of files from one machine to another (where the directory on the target machine is empty), it can easily take 5 hours to determine which files need to be transferred before the actual data transfer starts.

#Run on the target machine
rsync -avzr -e ssh --delete-after source_machine:/data/ /data/

#Use a less secure encryption algorithm to speed up the process
#(blowfish and arcfour were removed in OpenSSH 7.6; on newer systems pick a cipher listed by ssh -Q cipher)
rsync -avzr --rsh="ssh -c blowfish" --delete-after source_machine:/data/ /data/

#Use an even less secure algorithm to get the top speed
rsync -avzr --rsh="ssh -c arcfour" --delete-after source_machine:/data/ /data/

#By default, rsync decides which files to transfer by file size and modification time,
#then uses its delta-transfer algorithm (block checksums) to send only the changed parts.
#--whole-file skips the delta algorithm and sends each changed file in full
rsync -avzr --rsh="ssh -c arcfour" --delete-after --whole-file source_machine:/data/ /data/

Anyway, no matter what you do, the top speed of rsync on a consumer-grade gigabit network is around 45MB/s. On average, the speed is around 25-35MB/s. Keep in mind that this number does not include the decision time, which can be a few hours.

rsyncd (rsync as a daemon)

Thanks to a comment from one of our readers, I got a chance to investigate running rsync as a daemon. The idea is similar to plain rsync, except that rsync runs on the server as a service/daemon. We specify which directory we want to “export” to the clients (e.g., /usr/ports). Clients then talk to the daemon directly over the rsync protocol instead of going through SSH, which removes the encryption overhead. Here is how to set up an rsync server on FreeBSD:

sudo nano /usr/local/etc/rsyncd.conf

And this is my configuration file:

pid file = /var/run/rsyncd.pid

#Notice that I use derrick here instead of other system users, such as nobody
#That's because nobody does not have permission to access the path, i.e., /data/
#Either you make the source directory available to "nobody", or you change the daemon user.
uid = derrick
gid = derrick
use chroot = no
max connections = 4
syslog facility = local5

[mydata]
   path = /data/
   comment = data

Don't forget to include the following in /etc/rc.conf, so that the service will be started automatically:

rsyncd_enable="YES"

Let's start the rsync service:

sudo /usr/local/etc/rc.d/rsyncd start

To pull the files from the server to the clients, run the following:

rsync -av myserver::mydata /data/

#Or you can enable compression
rsync -avz myserver::mydata /data/

To my surprise, it works much better than running rsync alone. Here are some data I collected while transferring 10TB of files from ZFS to ZFS:

Bandwidth measured on the client machine: 70MB/s

zpool IO speed on the client side: 75MB/s

P.S. Initially, the speed was about 45-60MB/s; after I tweaked my zpool, I could get the top speed to 75-80MB/s.

I notice that the decision time is much shorter than when running rsync alone. The process is also much more stable, with zero interruptions such as the following:

rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at io.c(521) [receiver=3.1.0]
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(632) [generator=3.1.0]
rsync: [receiver] write error: Broken pipe (32)

NetCat

NetCat is similar to cat, except that it works at the network level. I decided to use netcat for the initial transfer; if it is interrupted, I will let rsync take over. Netcat does not encrypt the data, so the overhead is very small. If you transfer files within a local network and you don’t care about security, netcat is a perfect choice.

Netcat has only one disadvantage: it can handle only one file at a time. That doesn’t mean you need to run netcat for every single file. Instead, we can tar the files before feeding them to netcat, and untar them at the receiving end. As long as we do not compress the files, we can keep the CPU usage low.

#Open two terminals, one for the source and another one for the target machine.

#On the target machine:
#Go to the directory, e.g., 
cd /data

#Run the following:
nc -l 9999| tar xvfp -

#On the source machine:
#Go to the directory, e.g.,
cd /data

#Pick a port number that is not being used, e.g., 9999
tar -cf - . | nc target_machine 9999

Unlike rsync, the transfer starts right away, and the maximum speed is around 45 to 60MB/s on a gigabit network.

Conclusion

Candidates  Top Speed (w/o compression)  Top Speed (w/ compression)  Resume  Stability  Instant Start?
scp         40MB/s                       25MB/s                      No      Low        Instant
rsync       25MB/s                       50MB/s                      Yes     Medium     Long Preparation
rsyncd      30MB/s                       70MB/s                      Yes     High       Short Preparation
netcat      60MB/s (tar w/o -z)          40MB/s (tar w/ -z)          No      Very High  Instant
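
To put these speeds in perspective, the sketch below estimates how long the 10TB transfer would take at the best speed listed for each tool. The helper name estimate_hours is hypothetical, decimal units are assumed (1 TB = 1,000,000 MB), and the estimate ignores rsync's decision time.

```shell
# Hypothetical helper: hours needed to move SIZE_TB terabytes at SPEED MB/s.
estimate_hours() {
  awk -v tb="$1" -v speed="$2" \
    'BEGIN { printf "%.1f\n", tb * 1000000 / speed / 3600 }'
}

# Best speed per tool, taken from the table above (MB/s).
for entry in "scp 40" "rsync 50" "rsyncd 70" "netcat 60"; do
  set -- $entry
  echo "$1: $(estimate_hours 10 "$2") hours for 10TB"
done
```

Even at 70MB/s the copy takes well over a day, which is why resumability and stability matter as much as raw speed here.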

 

Choice / Command / Pros / Cons

#1 scp
  Pros: can be sped up by choosing a simple encryption cipher; can recursively copy directories
  Cons: (none listed)
#2 rsync
  Pros: flexible and convenient for directory synchronization
  Cons: possible, but not easy, to configure it NOT to use encryption
#3 sftp
  Pros: can be sped up by choosing a simple encryption cipher
  Cons: can’t recursively copy directories

Notice: any of these tools consumes about 5-10% of CPU on both the sender and the receiver, apparently for encryption/decryption.

TEST DETAILS

These are the actual commands and generated output by these tools.

rsync

default "-a" --archive mode
"-z" compress
rsync -a --progress DIR2REPLICATE root@10.10.10.2:/tmp
   411533312   0%   45.38MB/s    0:26:04
  1407877120   1%   44.41MB/s    0:26:16
  1716748288   2%   42.09MB/s    0:27:36
  2002550784   2%   46.47MB/s    0:24:541
  2382397440   3%   45.31MB/s    0:25:24
  2762407936   3%   45.34MB/s    0:25:15
rsync -az --progress DIR2REPLICATE root@10.10.10.2:/tmp
991383915 100%   13.67MB/s    0:01:09
   990955265 100%   14.02MB/s    0:01:07
   202624740 100%   15.42MB/s    0:00:12
   202771784 100%   15.87MB/s    0:00:12
    91676674 100%   12.86MB/s    0:00:06
    91628045 100%   11.76MB/s    0:00:07
  1082301721 100%   16.86MB/s    0:01:01
  1081744094 100%   17.14MB/s    0:01:00
   444531263 100%   13.06MB/s    0:00:32
   444311917 100%   12.97MB/s    0:00:32
    25956199 100%   11.99MB/s    0:00:02
    25387962 100%   16.94MB/s    0:00:01
    94059363 100%   15.51MB/s    0:00:05
    94189273 100%   14.61MB/s    0:00:06
   369550738 100%   16.31MB/s    0:00:21
   370924791 100%   15.96MB/s    0:00:22
   143659839 100%   14.75MB/s    0:00:09
   141681760 100%   14.58MB/s    0:00:09
    74662680 100%   14.45MB/s    0:00:04
    73882769 100%   12.73MB/s    0:00:05
     1809543 100%   13.59MB/s    0:00:00
###
### “-c arcfour” cipher is defined in RFC 4253; it is plain RC4 with a 128-bit key
###
rsync -a -P -e "ssh -T -c arcfour -o Compression=no -x" DIR2REPLICATE root@10.10.10.2:/tmp
  1081744094 100%   65.35MB/s    0:00:15
444531263 100%   56.34MB/s    0:00:07
444311917 100%   61.61MB/s    0:00:06
369550738 100%   53.94MB/s    0:00:06
370924791 100%   60.03MB/s    0:00:05
23319017231 100%   65.89MB/s    0:05:37
23308793162 100%   64.88MB/s    0:05:42
11951287020 100%   65.68MB/s    0:02:53
3453648896  28%   68.11MB/s    0:02:0

scp

default “-r” recursive
“-C” compress “-r” recursive
scp -r DIR2REPLICATE root@10.10.10.2:/tmp
100%  193MB  64.4MB/s   00:03
100%  424MB  60.6MB/s   00:07
100%  945MB  63.0MB/s   00:15
100%  945MB  59.1MB/s   00:16
100% 1032MB  64.5MB/s   00:16
100% 1032MB  60.7MB/s   00:17
100%  749MB  53.5MB/s   00:14
100% 1253MB  62.6MB/s   00:20
18% 4615MB  62.6MB/s   05:18
scp -Cr DIR2REPLICATE root@10.10.10.2:/tmp
100%  193MB  16.1MB/s   00:12
100%  424MB  14.6MB/s   00:29
100%  945MB  15.0MB/s   01:03
100%  945MB  14.8MB/s   01:04
100%  424MB  14.1MB/s   00:30
100%  352MB  17.6MB/s   00:20
100%  193MB  17.6MB/s   00:11
100%  135MB  16.9MB/s   00:08
100% 1032MB  17.8MB/s   00:58
100% 1032MB  17.8MB/s   00:58
100%  354MB  17.7MB/s   00:20
100%  749MB  18.3MB/s   00:41
100% 1253MB  18.4MB/s   01:08
  6% 1518MB  17.7MB/s   21:43
“-c arcfour” cipher is defined in RFC 4253; it is plain RC4 with a 128-bit key.
scp -c arcfour -r DIR2REPLICATE root@10.10.10.2:/tmp
100%  424MB 141.3MB/s   00:03
100%  945MB 135.0MB/s   00:07
100%  945MB 189.1MB/s   00:05
100%  424MB 141.2MB/s   00:03
100%  352MB 117.5MB/s   00:03
100% 1032MB 147.4MB/s   00:07
100% 1032MB 147.5MB/s   00:07
100%  749MB 149.8MB/s   00:05
100% 1253MB 156.6MB/s   00:08
100%   24GB 142.0MB/s   02:53
100%  595MB 119.1MB/s   00:05
100%   82GB 138.3MB/s   10:09
51% 9099MB 141.3MB/s   01:01
sftp
default behavior
“-R” to increase request queue length (default is 64)
“-B” to increase read/write request size (default is 32 KB)
sftp  root@10.10.10.2:/tmp
10% 2363MB  57.8MB/s   05:43 ETA
15% 3349MB  58.1MB/s   05:25 ETA
32% 7311MB  59.3MB/s   04:11 ETA
35% 7803MB  60.6MB/s   03:58 ETA
43% 9594MB  62.1MB/s   03:23 ETA
69%   15GB  58.6MB/s   01:55 ETA
77%   17GB  62.1MB/s   01:20 ETA
sftp  -R 128 -B 65536 root@10.10.10.2:/tmp
  2%  551MB  58.9MB/s   06:08 ETA
  8% 1806MB  62.3MB/s   05:28 ETA
41% 9170MB  60.6MB/s   03:35 ETA
56%   12GB  62.6MB/s   02:32 ETA
100%   22GB  62.5MB/s   05:56
“-c arcfour” cipher is defined in RFC 4253; it is plain RC4 with a 128-bit key.
sftp -oCiphers=arcfour root@10.10.10.2:/tmp
3%  711MB 142.5MB/s   02:31 ETA
18% 4115MB 146.0MB/s   02:04 ETA
23% 5156MB 148.1MB/s   01:55 ETA
28% 6379MB 144.6MB/s   01:49 ETA
34% 7672MB 144.0MB/s   01:41 ETA
37% 8389MB 143.7MB/s   01:36 ETA
62%   14GB 143.8MB/s   00:58 ETA
85%   19GB 142.4MB/s   00:22 ETA
92%   20GB 142.3MB/s   00:12 ETA
100%   22GB 144.4MB/s   02:34

TEST ENVIRONMENT

The test was performed between two servers interconnected by a private 10 Gbit link with 9000 MTU “jumbo frames”. The files copied were large (100s of GB) binary files.

iperf network bandwidth test between 10.10.10.2 and 10.10.10.1
Network interface configuration (10 Gbit, MTU 9000)
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
————————————————————
[  4] local 10.10.10.2 port 5001 connected with 10.10.10.1 port 57279
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  11.5 GBytes  9.89 Gbits/sec
[  5]  0.0-30.0 sec  34.6 GBytes  9.89 Gbits/sec
[  4]  0.0- 0.9 sec  1000 MBytes  9.80 Gbits/sec
[  5]  0.0- 8.8 sec  9.77 GBytes  9.53 Gbits/sec
[  4]  0.0- 8.7 sec  10.0 GBytes  9.89 Gbits/sec
[  5]  0.0-86.8 sec   100 GBytes  9.89 Gbits/sec
[root@10.10.10.2]# cat /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE="bond0"
BOOTPROTO=none
IPADDR=10.10.10.2
NETMASK=255.255.255.192
ONBOOT="yes"
USERCTL=no
TXQUEUELEN=100000
MTU=8912
BONDING_OPTS="mode=1 miimon=200 primary=eth0"

 

ALTERNATIVE METHODS TO MOVE DATA FAST

Copy directories via netcat: tar | nc. This gives a speed of ~251 MB/s ( = ~1 TB/hr).

### On receiver ###
nc -v -l 5555 | tar -xvf -

### On sender: test2del is the large directory to move ###
time tar -cvf - test2del | nc -v 10.100.100.2 5555

### Output calculated ###
11GB in  27.465 s = 293 MB/s
42GB in 2m51.513s = 249 MB/s (~1 TB/hr)
42GB in 2m50.630s = 251 MB/s (~1 TB/hr)

Replacing IP Address in Apache2 config files with SED

Suppose I have just mirrored my VPS machine (starting from a clone and then rsync-ing all the needed files). Obviously I need to change the IP address contained in all the config files, but I’m lazy.
So, let’s use sed to do it all at once, with a single command line.
I need to replace the IP address “192.168.100.5” with “192.168.100.4” in all files under /etc/apache2/.

Our command for one file should be:

$ sed -i 's/192\.168\.100\.5/192.168.100.4/g' /etc/apache2/sites-available/default

The dots in the search pattern are escaped because an unescaped dot matches any character.

We want to do this on a bunch of files that contain that string. sed accepts several file names (the shell expands wildcards for it), but it does not descend into subdirectories on its own.
For this purpose we can use the Linux find utility, which runs sed on every file it finds. The command should look like this:

$ find /etc/apache2/sites-available/ -type f -exec sed -i 's/192\.168\.100\.5/192.168.100.4/g' {} \;

And while we are here: if you obtained the backup machine from a clone of the production machine and you are keeping the filesystems in sync, this is what you should do with the relevant files each time a sync has finished.

The same commands for another pair of addresses (FILE stands for the file to edit):

sed -i -r 's/192\.168\.1\.35$/192.168.1.14/g' FILE

find . -type f -exec sed -i 's/192\.168\.1\.35/192.168.1.14/g' {} \;
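
One pitfall of a plain substitution is that 192.168.100.5 is also a prefix of addresses such as 192.168.100.50. The sketch below rehearses the replacement on throwaway files and adds \b word boundaries to avoid that. It assumes GNU grep, sed and xargs (for \b, -i and -r); the directory and file contents are made up for the demonstration.

```shell
set -eu
conf_dir=$(mktemp -d)
mkdir -p "$conf_dir/sites-available"
printf 'Listen 192.168.100.5:80\nAllow from 192.168.100.50\n' \
  > "$conf_dir/sites-available/default"
printf 'ServerName example.test\n' > "$conf_dir/ports.conf"

old='192\.168\.100\.5'   # dots escaped so they match literally
new='192.168.100.4'

# Edit only the files that actually contain the old address.
# \b stops the pattern from matching inside 192.168.100.50.
grep -rlE "\b${old}\b" "$conf_dir" |
  xargs -r sed -i -E "s/\b${old}\b/${new}/g"

result=$(cat "$conf_dir/sites-available/default")
echo "$result"
```

Running grep -rn with the same pattern before and after the change is a simple way to confirm that nothing was missed.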

rsync

There are many commands to copy a directory in Linux. The difference between them in current Linux distribution are very small. All of them support link, time, ownership and sparse.

I tested them to copy a Linux kernel source tree. Each command I tested twice and keep the lower result.
The original directory size is 639660032 bytes. All methods generate exact same size of 675446784 bytes without sparse option.

Tool    Non Sparse                          Sparse
rsync   rsync -a src /tmp                   rsync -a -S src /tmp
cpio    find src -depth | cpio -pdm /tmp    find src -depth | cpio -pdm --sparse /tmp
cp      cp -a --sparse=never src /tmp       cp -a --sparse=always src /tmp
tar     tar -c src | tar -x -C /tmp         tar -c -S src | tar -x -C /tmp
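To see what the sparse options change, compare a file's apparent size with the blocks it actually occupies. This is a small sketch assuming GNU coreutils (truncate, cp --sparse, du --apparent-size); the file names are made up for the demo, and how many blocks a hole really occupies depends on the filesystem.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)

# Create a 10 MiB file that is one big hole: large apparent size, no data written.
truncate -s 10M "$work/sparse.img"

# Copy it while asking cp to turn runs of zero bytes back into holes.
cp --sparse=always "$work/sparse.img" "$work/copy.img"

echo "apparent size (KiB):"
du -k --apparent-size "$work/sparse.img" "$work/copy.img"
echo "allocated on disk (KiB):"
du -k "$work/sparse.img" "$work/copy.img"
```

On a filesystem that supports holes, the second du listing should be far smaller than the first; with --sparse=never the copy would allocate the full 10 MiB.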

SCP: Secure Copy

Secure Copy is just like the cp command, but secure. More importantly, it has the ability to send files to remote servers via SSH!

Copy a file to a remote server:

# Copy a file:
$ scp /path/to/source/file.ext username@hostname.com:/path/to/destination/file.ext

# Copy a directory:
$ scp -r /path/to/source/dir username@hostname.com:/path/to/destination

This will attempt to connect to hostname.com as user username. It will ask you for a password if no SSH key is set up between the two computers (or if the key itself is protected by a passphrase). If the connection is authenticated, the file will be copied to the remote server.

Since this works just like SSH (using SSH, in fact), we can add flags normally used with the SSH command as well. For example, you can add the -v and/or -vvv to get various levels of verbosity in output about the connection attempt and file transfer.

You can also use the -i (identity file) flag to specify an SSH identity file to use:

$ scp -i ~/.ssh/some_identity.pem /path/to/source/file.ext username@hostname:/path/to/destination/file.ext

Here are some other useful flags:

  • -p (lowercase) – Preserves modification times, access times, and modes from the original file
  • -P – Choose an alternate port
  • -c (lowercase) – Choose a cipher other than the default for encryption (the default depends on your OpenSSH version)
  • -C – Compress files before copying, for faster upload speeds (already compressed files are not compressed further)
  • -l – Limit bandwidth used, in kilobits per second (8 bits to a byte!).
    • e.g. Limit to 50 KB/s: scp -l 400 ~/file.ext user@host.com:~/file.ext
  • -q – Quiet output
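Because -l takes kilobits rather than kilobytes, it is easy to be off by a factor of eight. Here is a tiny sketch of the conversion; kbytes_to_kbits is a hypothetical helper, and the host in the printed command is the same placeholder used above.

```shell
#!/usr/bin/env bash
# kbytes_to_kbits KBYTES_PER_SEC -> value to pass to scp -l (kilobits per second).
# Hypothetical helper; scp itself only accepts the kilobit figure.
kbytes_to_kbits() {
    echo $(( $1 * 8 ))
}

limit=$(kbytes_to_kbits 50)

# Print the command rather than run it, since the host is a placeholder.
echo "scp -l $limit ~/file.ext user@host.com:~/file.ext"
```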

Rsync: Sync Files Across Hosts

Rsync is another secure way to transfer files. Rsync has the ability to detect file differences, giving it the opportunity to save bandwidth and time when transferring files.

Just like scp, rsync can use SSH to connect to remote hosts and send/receive files from them. Mostly the same rules and SSH-related flags apply to rsync as well.

Copy files to a remote server:

# Copy a file
$ rsync /path/to/source/file.ext username@hostname.com:/path/to/destination/file.ext

# Copy a directory (the trailing slash on the source copies its contents into the destination dir):
$ rsync -r /path/to/source/dir/ username@hostname.com:/path/to/destination/dir

To use a specific SSH identity file and/or SSH port, we need to do a little more work. We’ll use the -e flag, which lets us choose/modify the remote shell program used to send files.

# Send files over SSH on port 8888 using a specific identity file:
$ rsync -e 'ssh -p 8888 -i /home/username/.ssh/some_identity.pem' /source/file.ext username@hostname:/destination/file.ext

Here are some other common flags to use:

  • -v – Verbose output
  • -z – Compress files
  • -c – Compare files based on checksum instead of mod-time (create/modified timestamp) and size
  • -r – Recursive
  • -S – Handle sparse files efficiently
  • Symlinks:
    • -l – Copy symlinks as symlinks
    • -L – Transform symlink into referent file/dir (copy the actual file)
  • -p – Preserve permissions
  • -h – Output numbers in a human-readable format
  • --exclude="" – Files to exclude
    • e.g. Exclude the .git directory: --exclude=".git"

There are many other options as well – you can do a LOT with rsync!

Do a Dry-Run:

I often do a dry-run of rsync to preview which files will be copied over. This is useful for making sure your flags are correct and that you won’t overwrite files you don’t wish to.

For this, we can use the -n or --dry-run flag:

# Copy the current directory
$ rsync -vzcrSLhp --dry-run ./ username@hostname.com:/var/www/some-site.com
#> building file list ... done
#> ... list of directories/files and some meta data here ...

Resume a Stalled Transfer:

Once in a while a large file transfer might stall or fail (while using either scp or rsync). We can actually use rsync to finish a file transfer!

For this, we can use the --partial flag, which tells rsync to keep partially transferred files instead of deleting them, so that it can attempt to finish the transfer on the next run:

$ rsync --partial --progress largefile.ext username@hostname:/path/to/largefile.ext

The Archive Option:

There’s also a -a or --archive option, which is a handy shortcut for the options -rlptgoD:

  • -r – Copy recursively
  • -l – Copy symlinks as symlinks
  • -p – Preserve permissions
  • -t – Preserve modification times
  • -g – Preserve group
  • -o – Preserve owner (User needs to have permission to change owner)
  • -D – Preserve special/device files. Same as --devices --specials. (User needs permissions to do so)

# Copy using the archive option and print some stats
$ rsync -a --stats /source/dir/path username@hostname:/destination/dir/path


1) tar over netcat, compressed with pigz

On the source:

tar -cf - /backup/ | pv | pigz | nc -l 8888

On the destination:

nc master.active.ai 8888 | pv | pigz -d | tar xf - -C /

2)
time tar -c /backup/ | pv | lz4 -B4 | ssh -c aes128-ctr root@192.168.1.73 "lz4 -d | tar -xC /backup"

3) copy files using netcat

4) rsync

50 MB/sec

rsync -aHAXWxv --numeric-ids --no-i-r --info=progress2 -e "ssh -T -c chacha20-poly1305@openssh.com,aes192-cbc -o Compression=no -x" /backup/ root@192.168.1.73:/backup/

time rsync -aHAXWxv --numeric-ids --no-i-r --info=progress2 -e "ssh -T -c chacha20-poly1305@openssh.com,aes192-cbc -o Compression=no -x" /backup/ root@192.168.1.73:/backup/


When copying to the local file system I always use the following rsync options:

# rsync -avhW --no-compress --progress /src/ /dst/

Here’s my reasoning:

-a is for archive, which preserves ownership, permissions etc.
-v is for verbose, so I can see what's happening (optional)
-h is for human-readable, so the transfer rate and file sizes are easier to read (optional)
-W is for copying whole files only, without delta-xfer algorithm which should reduce CPU load
--no-compress as there's no lack of bandwidth between local devices
--progress so I can see the progress of large files (optional)

70 MB/sec
5) time tar cvf - /backup/* | ssh -T -c chacha20-poly1305@openssh.com,aes192-cbc -o Compression=no -x root@192.168.1.73 "tar xf - -C /"

time tar cvf - /backup/* | pv | ssh -T -c chacha20-poly1305@openssh.com,aes192-cbc -o Compression=no -x root@192.168.1.73 "tar xf - -C /"

time tar -cpSf - /backup/* | pv | ssh -T -c chacha20-poly1305@openssh.com,aes192-cbc -o Compression=no -x root@192.168.1.73 "tar xf - -C /"
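The tar-to-tar pipe can be rehearsed locally before pointing it at a remote host. The sketch below drops the ssh hop and uses a made-up scratch tree in place of /backup; it shows the left-hand tar writing an archive to stdout and the right-hand tar unpacking it from stdin, then checks the result with diff -r.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
mkdir -p "$work/src/backup/sub" "$work/dst"
echo "hello" > "$work/src/backup/file.txt"
echo "world" > "$work/src/backup/sub/nested.txt"

# Same shape as the ssh version, minus the network hop:
# the left tar writes the archive to stdout, the right tar reads it from stdin.
tar -cf - -C "$work/src" backup | tar -xpf - -C "$work/dst"

# diff -r exits non-zero if any file differs or is missing.
diff -r "$work/src/backup" "$work/dst/backup" && echo "trees match"
```

The same diff -r check (or a checksum comparison) is worth running after a real transfer too, since a broken pipe can leave a truncated tree behind.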

6)
tar cvf - ubuntu.iso | gzip -9 - | split -b 10M -d - ./disk/ubuntu.tar.gz.
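The pieces written by split are only useful if they can be stitched back together. Here is a sketch of the round trip, assuming GNU split (for -d); a small random file stands in for ubuntu.iso and the piece size is reduced to 1M so the demo still produces several parts.

```shell
#!/usr/bin/env bash
work=$(mktemp -d)
cd "$work" || exit 1
mkdir disk restore

# Stand-in for ubuntu.iso: 3 MiB of incompressible data (hypothetical test file).
head -c 3145728 /dev/urandom > ubuntu.iso

# Archive, compress and split into numbered 1 MiB pieces (.00, .01, ...).
tar cf - ubuntu.iso | gzip -9 | split -b 1M -d - ./disk/ubuntu.tar.gz.

# Reassemble: the numeric suffixes sort in order, so a glob restores the stream.
cat ./disk/ubuntu.tar.gz.* | gzip -dc | tar xf - -C restore

cmp ubuntu.iso restore/ubuntu.iso && echo "restored copy is identical"
```

The default two-digit suffix only covers 100 pieces; for more, split's -a option widens it.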



#!/bin/bash
# SETUP OPTIONS
export SRCDIR="/folder/path"
export DESTDIR="/folder2/path"
export THREADS="8"
# RSYNC DIRECTORY STRUCTURE (DIRECTORIES ONLY, NO FILES YET)
rsync -zr -f"+ */" -f"- *" "$SRCDIR/" "$DESTDIR/"
# FOLLOWING MAYBE FASTER BUT NOT AS FLEXIBLE
# cd "$SRCDIR"; find . -type d -print0 | cpio -0pdm "$DESTDIR/"
# FIND ALL FILES AND PASS THEM TO MULTIPLE RSYNC PROCESSES
cd "$SRCDIR"  &&  find . ! -type d -print0 | xargs -0 -n1 -P"$THREADS" -I% rsync -az % "$DESTDIR/%"
# IF YOU WANT TO LIMIT THE IO PRIORITY,
# PREPEND THE FOLLOWING TO THE rsync & cd/find COMMANDS ABOVE:
#   ionice -c2

# REMOTE VARIANT: THE SAME TWO STEPS, BUT OVER SSH TO remotehost
# (arcfour WAS REMOVED IN OpenSSH 7.6; USE A CIPHER YOUR SSH SUPPORTS)
rsync -zr -f"+ */" -f"- *" -e 'ssh -c arcfour' "$SRCDIR/" "remotehost:$DESTDIR/" \
&& \
cd "$SRCDIR"  &&  find . ! -type d -print0 | xargs -0 -n1 -P"$THREADS" -I% rsync -az -e 'ssh -c arcfour' % "remotehost:$DESTDIR/%"

Parallelizing rsync

Last week I had a massive hardware failure on one of the GlusterFS storage nodes in the ILRI, Kenya Research Computing cluster: two drives failed simultaneously on the underlying RAID5. As RAID5 can only withstand one drive failure, the entire 31TB array was toast. FML.

After replacing the failed disks, rebuilding the array, and formatting my bricks, I decided I would use rsync to pre-seed my bricks from the good node before bringing glusterd back up.

tl;dr: rsync is amazing, but it’s single threaded and struggles when you tell it to sync large directory hierarchies. Here’s how you can speed it up.

rsync #fail

I figured syncing the brick hierarchy from the good node to the bad node was simple enough, so I stopped the glusterd service on the bad node and invoked:

# rsync -aAXv --delete --exclude=.glusterfs storage0:/path/to/bricks/homes/ storage1:/path/to/bricks/homes/

After a day or so I noticed I had only copied ~1.5TB (over 1 hop on a dedicated 10GbE switch!), and I realized something must be wrong. I attached to the rsync process with strace -p and saw a bunch of system calls in one particular user’s directory. I dug deeper:

# find /path/to/bricks/homes/ukenyatta/maker/genN_datastore/ -type d | wc -l
1398640

So this one particular directory in one user’s home contained over a million other directories and $god knows how many files, and this command itself took several hours to finish! To make matters worse, careful trial and error inspection of other user home directories revealed more massive directory structures as well.

What we’ve learned:

  • rsync is single threaded
  • rsync generates a list of files to be synced before it starts the sync
  • MAKER creates a ton of output files/directories

It’s pretty clear (now) that a recursive rsync on my huge directory hierarchy is out of the question!

rsync #winning

I had a look around and saw lots of people complaining about rsync being “slow” and others suggesting tips to speed it up. One very promising strategy was described on this wiki and there’s a great discussion in the comments.

Basically, he describes a clever use of find and xargs to split up the problem set into smaller pieces that rsync can process more quickly.

sync_brick.sh

So here’s my adaptation of his script for the purpose of syncing failed GlusterFS bricks, sync_brick.sh:

#!/usr/bin/env bash
# borrowed / adapted from: https://wiki.ncsa.illinois.edu/display/~wglick/Parallel+Rsync

# RSYNC SETUP
RSYNC_PROG=/usr/bin/rsync
# note the important use of --relative to use relative paths so we don't have to specify the exact path on dest
RSYNC_OPTS="-aAXv --numeric-ids --progress --human-readable --delete --exclude=.glusterfs --relative"
export RSYNC_RSH="ssh -T -c arcfour -o Compression=no -x"

# ENV SETUP
SRCDIR=/path/to/good/brick
DESTDIR=/path/to/bad/brick
# Recommend to match # of CPUs
THREADS=4
BAD_NODE=server1

cd "$SRCDIR" || exit 1

# COPY
# note the combination of -print0 and -0!
find . -mindepth 1 -maxdepth 1 -print0 | \
    xargs -0 -n1 -P$THREADS -I% \
        $RSYNC_PROG $RSYNC_OPTS "%" $BAD_NODE:$DESTDIR

Pay attention to the source/destination paths, the number of THREADS, and the BAD_NODE name, then you should be ready to roll.

The Magic, Explained

It’s a bit of magic, but here are the important parts:

  • The -aAXv options to rsync tell it to archive, preserve ACLs, and preserve eXtended attributes. Extended attributes are critically important in GlusterFS >= 3.3, and also if you’re using SELinux.
  • The --exclude=.glusterfs option to rsync tells it to ignore this directory at the root of the directory, as the self-heal daemon (glustershd) will rebuild it based on the files’ extended attributes once we restart the glusterd service.
  • The --relative option to rsync is so we don’t have to bother constructing the destination path, as rsync will imply the path is relative to our destination’s top.
  • The RSYNC_RSH options influence rsync’s use of SSH, basically telling it to use very weak encryption and disable any unnecessary features for non-interactive sessions (tty, X11, etc).
  • Using find with -mindepth 1 and -maxdepth 1 just means we list only the files/directories directly inside the source directory, so each top-level entry becomes its own rsync job.
  • Using xargs with -n1 and -P tells it to use 1 argument per command line, and to launch $THREADS number of processes at a time.
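To watch the find/xargs fan-out without copying anything, echo can stand in for rsync. This is only a sketch: the scratch tree and its names are made up, and 4 is an arbitrary job count.

```shell
#!/usr/bin/env bash
src=$(mktemp -d)
mkdir -p "$src/alice/docs" "$src/bob" "$src/carol/deep/tree"
touch "$src/alice/docs/a.txt" "$src/notes.txt"
cd "$src" || exit 1

# Each top-level entry becomes one job, and at most 4 jobs run at once.
# echo stands in for rsync here, so nothing is actually copied.
jobs=$(find . -mindepth 1 -maxdepth 1 -print0 |
    xargs -0 -n1 -P4 echo "rsync job for:" | sort)

printf '%s\n' "$jobs"
```

Note that the nested directories under alice and carol do not get jobs of their own; one very large top-level directory will therefore still be handled by a single rsync process.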