Hoshen and Kopelman

Hoshen and Kopelman [#!Hoshen!#] developed a sequential cluster
identification algorithm which gives each cluster a unique label and
counts the number of sites it contains. Each site belonging to
cluster is assigned a cluster label . The sites
in cluster may initially be assigned several different
cluster labels. This is chosen to be the smallest number in the set.
These are given as a set of natural numbers
{
}.
In this set only one number is regarded as the *proper* cluster
label which we designate as . This is the smallest number
in the set.

A connection between the label at any site, and the
proper cluster cluster label , is provided by an array
as
{
}.
This is constructed so that is the only positive
integer member of the set and
denotes the number of sites in the cluster, while the remaining
provide the links between the and the
proper cluster label . To be specific, if a site with label
is bonded to a site which has the proper cluster label
then

Therefore to get the proper cluster label for this site we have to go through using 6.12 then 6.11. Similarly for higher-order indirections, we just need to iterate until we reach a positive value of , which means that is the proper cluster label and the current number of sites in the cluster. Fortunately in most cases this hierarchy consists to one or two levels (one or two equations) only [#!Hoshen!#].

In practice the algorithm works as follows. We sweep through the sites
of the lattice looking at neighbors in the negative directions (there
are of them for a -dimensional lattice). If
site has no connected neighbors (or none of its neighbors have been
labeled) then it is assigned a new label
, and
. If there is only one connected neighbor, at site
with label
, say, then gets the same label
as
, and
. The good feature of this cluster labeling technique becomes
apparent when site links two or more previously labeled cluster
fragments into a single cluster, that is, when it is bonded to two or
more of its neighbors. No site belonging to any of these cluster
fragments is relabeled (so that once a site is labeled it retains this
label throughout the labeling process)--instead the readjustments
occur within the . The number of readjusted
's for a site is equal to the number of coalescing
cluster fragments at that site. Let us assume that the connected sites
belong to clusters
which have proper
cluster labels
with
being the smallest. These clusters coalesce at to
form a single larger cluster so we set label
and
readjust

It is this continual readjustment which keeps the hierarchy of indirections small. As a final stage in the algorithm we can pass through the lattice a second time setting the label at each site to be the proper cluster label . This makes it easier to pick out the clusters for updating.