First of all, I'd like to thank xinbin.chen for his help.
He is a good teacher, the kind who teaches you how to fish instead of just giving you fish.
Without further ado, let's get to the point.
There are two popular methods to keep robots and spiders out of your site.
1: Put a robots.txt file in the web root.
The rules look like this:
# allow a particular spider to crawl your site
User-agent: somespider
Disallow:

# disallow all spiders from crawling your site
User-agent: *
Disallow: /

# disallow a specific spider from crawling your site
User-agent: spider
Disallow: /
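These are separate recipes; in practice you combine the records you need into a single robots.txt. A minimal sketch that lets one crawler in and keeps everyone else out might look like this (Googlebot here is just an example name, substitute whichever spider you actually trust):

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /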
But nowadays many spiders disguise themselves as Firefox, Opera, or IE and simply ignore robots.txt,
so we need another method, which brings us to the second one.
2: If you use Apache, there is a better option.
Add the following lines to httpd.conf:
# set the a_robot variable for any request whose User-Agent contains "Robot" or "Spider"
SetEnvIfNoCase User-Agent Robot a_robot=1
SetEnvIfNoCase User-Agent Spider a_robot=1
Replace Robot/Spider with the name of whatever spider or robot you want to forbid.
# ... some lines omitted ...
Order allow,deny
Allow from all
Deny from env=a_robot
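For context, here is a minimal sketch of how these pieces might sit together in httpd.conf. The <Directory> path /var/www/html is only an assumption, change it to your own DocumentRoot; this uses the old Apache 2.2 Order/Allow/Deny access-control syntax (Apache 2.4 uses Require instead):

SetEnvIfNoCase User-Agent Robot a_robot=1
SetEnvIfNoCase User-Agent Spider a_robot=1

<Directory "/var/www/html">
    # assumed DocumentRoot; adjust to your own path
    Order allow,deny
    Allow from all
    Deny from env=a_robot
</Directory>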
Then do a graceful restart (apachectl graceful) and it will take effect.
Here is a story from when I worked on this task. We run Squid in front of our site. I changed the proxy so it would not cache the destination site, but something was still wrong, and I didn't know how to resolve it until chen explained some of the principles of Squid to me.
Change these parameters in squid.conf:
# these two lines send requests for this domain directly to the origin instead of through a cache peer
acl targetdomain dstdomain .urdomain.com
always_direct allow targetdomain
# this line tells Squid not to cache negative responses, such as ERROR/forbidden pages
negative_ttl 0
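After editing squid.conf, reload Squid so the new settings take effect; on most installs a plain reconfigure is enough (the binary path may differ on your system):

squid -k reconfigure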