## Abstract

An infofuse is a combustible fuse in which information is encoded through the patterning of metallic salts, with transmission in the optical range simply associated with burning. The constraints, advantages and unique error statistics of physical chemistry require us to rethink coding and decoding schemes for these systems. We take advantage of the non-binary nature of our signal with a single bit representing one of *N*=7 states to produce a code that, using a single or pair of intensity thresholds, allows the recovery of the intended signal with an arbitrarily high recovery probability, given reasonable assumptions about the distribution of errors in the system. An analysis of our experiments with infofuses shows that the code presented is consistent with these schemes, and encouraging for the field of chemical communication and infochemistry given the vast permutations and combinations of allowable non-binary signals.

## 1. Introduction

Infochemistry is an emerging field, attempting to develop methods of storage and transmission of information using chemical or material means. The advantage of these modes of communication over electronic communication will depend on the speed, reliability and versatility of the transmission, as well as the conditions under which the signal is to be sent (e.g. with no power source available), but remains relatively unexplored, as the maximum rate at which information can be transmitted is limited by the physical length scales and time scales in the system, as well as the noisiness of the channels, which clearly differs significantly from their electronic analogues.

Recent work has shown that it is possible to transmit and receive messages by both chemical and material means with an infofuse (Thomas *et al.* 2009; Kim *et al*. 2010)—a combustible system in which patterned metallic salts encode information, and an infobubble (Hashimoto *et al.* 2009)—a bubble-based microfluidic device in which optical pulses encode information. Infofuses are advantageous in a number of ways: sending a message does not require a separate power supply, and the lightweight, compact and self-contained nature of the fuse allow for great mobility for the sender of the signal (although electrical power is still required to receive). Individual pulses from the metallic salts has extremely narrow spectral widths (0.01 nm) and enables the use of bandpass filters with the narrowest possible spectral widths (1 nm). These narrow filters allow very weak signals to be discerned over the background (Scoog *et al.* 2007). However, reliable transmission using any communication system requires the development of methods to overcome the noise in transmission, which is itself a function of the system. Over the past half century, various error-correcting codes (Peterson & Weldon 1972; Cover & Thomas 1991; Wicker & Bhargava 1994; Cohen *et al.* 1997; MacKay 2003; Mitzenmacher 2009) have been developed and allow for the possibility of accurate signal recovery with minimal loss of information density, for both binary and non-binary systems. In its most elemental form, error correction is accomplished by including redundant check bits that convolve the positions and values of each bit in the signal in a simple manner. While binary communication schemes are the most commonly studied (Hamming 1950; Peterson & Weldon 1972), non-binary alphabets (where each bit can attain one of *N* possible values) may increase the efficiency of these redundant bits in correcting errors (Mitzenmacher 2006, 2009) at the cost of introducing further complexity in both encoding or correction. Here, we focus our attention on the infofuse that uses a triplet code of pulses with a bit taking on *N*=7 possible values and develop a simple error-correcting code tailored to correct the errors that take advantage of the inherent non-binary nature of the system (Mitzenmacher 2006, 2009; rather than the *N*=2 binary signals in common digital communication approaches). The coding scheme introduces redundancy in the transmitted signal, with a message of length *n*+*m* transmitted to communicate an intended signal of length *n* using *m* redundant check bits. Our approach, which combines theory and experiment, shows that this code can recover the intended sequence with near certainty, given some simple but reasonable assumptions about the distribution of insertion errors. We further show that using a pair of thresholds, dividing the data into ‘clear’ and ‘indeterminate’ peaks can increase both the reliability of recovery and reduce the computational complexity of the algorithm. Finally, we show that for the infofuse with a bit taking on *N*=7 possible values, we achieve a good balance between the length of the fuse required to send a message and the efficiency (the ratio *n*/(*n*+*m*)) of the error correction.

## 2. The infofuse

### (a) The experimental system

Infofuses (Thomas *et al.* 2009) are strips of a flammable polymer (nitrocellulose) patterned with spots of thermally emissive salts. An ignited infofuse supports a flame front that propagates along the strip at a roughly constant velocity and successively ignites each spot in turn, thereby emitting optical pulses in time. Infofuses use three distinct alkali metals with very sharp emission spectra: potassium (K, at 766.49 nm), rubidium (Rb, at 780.03 nm) and caesium (Cs, at 852.11 nm; Ralchenko *et al.* 2010). The nitrocellulose strips are of the order of 2 to 3 mm wide, and 0.1 mm thick. The speed of the flame front as it propagates along the fuse depends strongly on the fuse width, but is generally in the range of 1–3 cm s^{−1}. The emission from each of the chemical spots is observed by a telescopic receiver that monitors the emission midpoint of each element. For a sharply defined flame front propagating along a thin fuse, the separation between the applied spots must be larger than the width of the flame, else two spots will ignite simultaneously (which will prevent the transmission of the message). Variations in the shape of the flame front, the finite width of the fuse and non-uniformities in the nitrocellulose will require a yet larger separation between peaks to enable reliable communication. In our experiments, the observed light pulses have a duration of about 100 ms (approx. 0.3 mm wide), and the spacings between subsequent chemical spots are of the order of approximately 1 cm. This separation is significantly larger than the 2–3 mm flame front, and we do not find significant overlap between the emissions from individual spots experimentally.

The detector has excellent range, and clear signals are obtainable from more than 500 m away (with an estimated 1.4 km maximum range). Signals sent within the laboratory (close range of approx. 20 m only) allowed four to five pulses per fuse before falling out of the range of view of the telescope, so long messages sent within the laboratory were broken into multiple fuses. In order to simulate the effect of outdoor conditions for experiments in the laboratory, an optical density (OD) 2 filter was inserted into the telescopic detection device. This filter reduced the incoming light intensity by approximately 99 per cent and simulated the conditions of the infofuse being about 10 times further away. Experiments performed outdoors showed that while detection of the signal was difficult in direct sunlight, if the infofuse was kept out of the sun, the signal was sufficiently above background to be detected during the day. For the purpose of encoding information with these three emitters, we used a scheme that assigns alphanumeric characters to simultaneous combinations of unique optical pulses (Thomas *et al.* 2009): seven (2^{3}−1) unique optical pulses exist for three distinct emitters. Each pulse combination is given a numerical value, with K=0, Cs=1, Rb=2, K–Cs=3, K–Rb=4, Cs–Rb=5 and K–Cs–Rb=6. Two consecutive pulses (giving a total of 49 unique pair combinations) are therefore sufficient to encode each alphanumeric character and some special characters (see the electronic supplementary material for further discussion). This encoding scheme was used in experiments in the laboratory (with the OD 2 filter), while outdoor experiments were intended for benchmarking and did not use an encoded message.

A portable infofuse detection system, in the form of a three-channel hard-limiter receiver, was implemented for long-distance experiments to verify the single-threshold encoding/correction algorithm (figure 1). The main modules of the system consist of three channels of high-sensitivity photo-receivers (PDF10A, Thorlabs Inc.) and peripheral optics that separate and amplify three spectrum emissions from the burning infofuse (767, 78 and 852 nm). A custom printed circuit board is fabricated to perform signal conditioning prior to the error-correction algorithm. The analogue waveforms from optical receivers are first scaled to 5 V by a resistive divider, then a tunable eighth-order Butterworth low-pass filter (LPF; MAX7480 from Maxim IC Inc.) with a cut-off frequency programmed to be around five times the data rate of the infofuse is used to reduce glitches and spikes in the waveforms. After the LPF, a multi-channel voltage comparator (TLC3704 from Texas Instruments Inc.) generates event trigger digital pulses, indicating the existence of certain spectrum emissions. This is performed by comparing the output from each LPF with a predefined reference voltage (*V* _{1}–*V* _{3}). Each reference is independently adjustable to compensate for different sensitivities and incident intensities in each optical channel. Because of the LPF, the spurious fluctuations in the digital output can be reduced. The digital outputs of the comparators are then transmitted via digital buffers (MM74C04, Fairchild Semiconductor Corp.) to a high-speed field-programmable gate array chip for digital signal processing, where an asynchronous algorithm implements message acquisition and decoding. The algorithm puts the digital system into standby mode and only wakes it up for data acquisition if the optical–electrical signal of any of the three photo-receivers crosses the thresholds *V* _{1}–*V* _{3}. In this way, the data-acquisition rate can be automatically adjusted to fit the speed of burning in the infofuse, thus improving the data-transmission efficiency and reducing system power consumption. The results of the decoded messages are displayed on a liquid crystal display (CFAH1604A-YYH-JT, Crystalfontz). In experiments, both the output from the three-channel receiver and the raw data from the detector were available, with the raw data being used to analyse the robustness of the signal detection, particularly in the case of two-threshold error correction (see below). Although only one bit of information is extracted from the three channels (corresponding to each wavelength) using this setup, one could encode more information into the intensity levels of each channel, thereby increasing the number of states. The detection system can be adapted to discern multiple intensity levels, either for higher density encoding or to implement the two-threshold correction system below by adding one more TLC3704 chip to the system.

### (b) Errors in the infofuse

In figure 2*a*, we show a typical sample of the measured intensity of the infofuse associated with the transmission of the message (in this case, the encoding for ‘tufts’, described further below) as a function of time, with many of the intended peaks being sharp, well defined (although of variable intensity) and well above background. However, at the beginning and end of this particular transmission, the noise is much larger (typically the higher noise initially may be owing to the ignition of a match); indeed, decreasing the thresholding level would likely add spurious additional potassium peaks to the beginning and end of the signal and produce insertion errors in the signal. In figure 2*b*, corresponding to the message we see that the intended peak near 8 s (marked as intended in the figure) has a temporal profile similar to the other intended peaks, but with a significantly reduced maximum intensity. This is probably caused either by a misapplication of the metallic spot (too little potassium), or because portions of the pattern did not successfully ignite. The intensities of the noise peaks near 8.5 and 9 s are similar in both intensity and profile to the low intended peak at 8 s, making it difficult to distinguish between signal and noise. The probable cause of these noise peaks are non-uniformity in the flame front or inhomogeneities in the fuse, both of which could delay the ignition of a portion of the applied salt. We note that increasing the concentration of the salt is expected to have little effect on these inhomogeneities, and thus would not likely alter the noise observed at 8.5 or 9 s in figure 2*b*. Improvements in the accuracy and uniformity of the applied solvent would likely partially prevent the drastic reduction in the intensity of intended peaks (as was the case near 8 s in figure 2*b*), but cannot prevent inhomogeneity in emission time.

The three peaks near 8.5 s (occurring at 8.3, 8.45 and 8.6 s) have a very small temporal separation compared with the separation between intended peaks (0.15 and 0.3 s, respectively), and one could reasonably discard the noise peak at 8.45 s based on its proximity to the other peaks. However, if the spacing between spots on the fuse was reduced, then such a large temporal threshold would run the risk of discarding intended peaks as well. The noise peak near 9 s could be discarded (as well below background), treating it as an insertion error (a K–Rb peak followed by a temporally very close Cs peak), or merge with the intended peak (with the intended Cs peak read as a K–Rb–Cs peak). While the particular values of the thresholds in temporal spacing and intensity will have an effect on the noisiness of the signal sent by the infofuse, experimentally, we find that the primary sources of error are associated with the insertion of unintended bits and possible permutations of intended bits. Importantly, we do not have the problem of any *missing* peaks over the range of operation of our receivers, so that we do not need to worry about deletion errors as we develop a robust error-correction scheme for the infofuse.

In order to overcome the noise in transmission, a simple solution is to increase the spacing between the metallic patterns. This will cause all the noisy peaks to be well separated, so that unambiguously determining the applied chemicals is virtually assured without confusion owing to neighbouring peaks. However, this will not only decrease the physical rate of the information transfer, but also the length of the signals that can be sent (as messages must be encoded on a fuse of finite length). If we have *n* intended peaks separated by a distance *d*_{0}, and the average flame front velocity is *v*, then the physical rate is *r*_{0}=*v*/*d*, with a time approximately *n*/*r*_{0} required to transmit the message. By increasing the spacing from *d*_{0} to *d*, we decrease the physical rate of order *nd*_{0}/*d*. In figure 2, the temporal separation between intended peaks is of the order of the width of the chemical patterning. Increasing the spacing by a factor of 2 will significantly separate the peaks, simplifying the process of distinguishing between data and noise, but at the cost of cutting the physical rate in half and doubling the length of the transmission time.

An alternative to simply increasing the distance between peaks is to introduce a self-correcting code into our encoding for the infofuse, which will allow the correct signal to be reconstructed in the presence of noise. Such a code will be preferable to simply increasing the distance between peaks if it is more efficient (i.e. allows a message to be reliably sent at a higher rate). A variety of error-correcting codes have been developed to allow the recovery of a noisy signal under differing noise conditions. In a perfect world, an error-correcting code would be designed so that any code word sent could not be confused for any other possible code word, regardless of the errors that occur (figure 3). The classic Hamming (Hamming 1950; Cover & Thomas 1991) or Golay (Cohen *et al.* 1997) codes, the more commonly used Reed–Solomon codes (Wicker & Bhargava 1994) or the more modern and extremely high rate low-density parity check (LDPC) codes (Gallager 1962; MacKay 1999) allow the correction of up to a fixed number of permutation errors ⌊(*h*−1)/2⌋ (with *h* the minimum or Hamming distance of the code), with 100 per cent probability. All of these codes are effectively implemented for the correction of permutation errors, where the intended bit value *d*_{i} may be permuted via channel noise into an unintended value *d*_{i}′. However, experimentally, we find that the infofuse suffers from insertion errors (where bits may be added to the signal; figure 2), which cannot be handled using these well-known codes. While error-correcting codes that address insertion or deletion errors have been studied (Tenengolts 1984; Klove 1995; Davey & MacKay 2001; Drinea & Mitzenmacher 2007; Mitzenmacher 2009) from a variety of approaches, many are not optimally adaptable to non-binary codes. LDPC codes, which are extremely efficient for long signals, are somewhat inappropriate for the short signals that must be sent via the infofuse, owing to the physical constraints on the length of the fuse. Additionally, codes that are capable of correcting insertions *and* deletions, while only insertions are observed experimentally, are expected to be less efficient than a code that focuses solely on insertion errors.

## 3. Coding scheme

### (a) Check bits

We wish to develop a simple scheme that uses the non-binary nature of the infofuse to correct an arbitrary number of errors with high probability (MacKay 1999). As seen in figure 2*a*, if we choose a single, sufficiently low-intensity threshold, we can be certain of correctly including all data peaks, but must accept that some noise peaks will be inserted. Alternatively, if we choose a pair of thresholds with one sufficiently high, we can be sure that all noise peaks are excluded from the signal, but some data peaks may be excluded as well. These dropped data peaks will join a set of *indeterminate* peaks: intensity peaks which clearly indicate that there was some chemical present (well above background), but whose intensities are not high enough to be deemed ‘clear’ data peaks. In this case, the receiver must be able to accurately determine which of the indeterminate peaks were intended, and which are noise.

To allow error correction in the infofuse, the sender and receiver must agree on the number of data bits, *n*, and the number of check bits, *m*, beforehand. The check bits are chosen to convolve both the position and the value of each data bit in a unique way, with the first *N*−1 bits of the form
3.1where *d*_{i} is the data bit value at the *i*th position and *N*=7 is the number of possible bit values. This simple form for the check bits is suboptimal in many respects, but has the merits of clarity, simplicity and flexibility (see electronic supplementary material for further discussion). It is not difficult to see that the probability of a randomly chosen sequence producing a given set of is *p*_{fail}=*N*^{−m} (similar to the discussion in §8 of Mitzenmacher 2009). This exponentially decaying probability will allow the determination of a robust error-correcting code. Below, we describe the correction scheme for insertion errors (see electronic supplementary material for discussion of permutation errors). While codes that have the ability to correct a single insertion or deletion error with 100 per cent probability convolve data and position with two constraints (Tenengolts 1984; Sloane 2002), we will see the use of multiple check bits and large-alphabet (*N*=7) encoding allows us to correct an arbitrary number of insertion errors with high probability. We note that the advantages of non-binary communication do not depend on the use of an infofuse in particular, but could be applied to any communication system with many states per bit.

### (b) Single-threshold correction

If a single-intensity threshold is used (figure 2), the received signal will be a combination of all *n*+*m* data and check peaks, as well as *k*≥0 noise peak insertions. Our error-correction method will fail if a bit is truly deleted, but by choosing a sufficiently low threshold, we are able to ensure that all intended peaks are accurately recovered. However, this restriction is counterbalanced by the efficiency of our code (see below) that does not require the need to correct errors where an intended bit is deleted. For the moment, we assume the *m* check bits are perfectly recovered, an unrealistic an severe approximation that will be discussed in detail below. The possibility of permutation errors is also neglected here (see electronic supplementary material for further discussion). In order to model the noisy transmission, we assume a uniform probability *p* that a noise peak is inserted following any observed peak in the data block. The number of insertion errors observed in the system then has the distribution , giving 〈*k*〉=*np*/(1−*p*), with a variance 〈*k*^{2}〉−〈*k*〉^{2}=*np*/(1−*p*)^{2}. A receiver who finds a signal with *n*+*k* peaks can simply determine all possible subsequences of length *n* and compare the received check bits with the computed values (where the final *m* bits are assumed correct, see below for further discussion). There will be (*n*+*k* *n*) such sequences, so long sequences with multiple errors have an extremely high computational cost. The probability of recovering only the correct signal given *k* insertion errors is , and the total recovery probability (i.e. the probability that a single, unique sequence is recovered) is
3.2The recovery probability as a function of *m* is shown in figure 4, and it is clear that *P*_{rec} is sigmoidal in nature and increases rapidly beyond the midpoint of the transition. Equation (3.2) incorporates the total number of possible trial sequences rather than unique trial sequences (neglecting any correlations between the trial sequences), and thus will underestimate the probability of recovery.

A sequence containing the average number of insertion errors, *k*=〈*k*〉=*np*/(1−*p*), will require trial sequences, where , similar to the binary entropy function (Cover & Thomas 1991). Because of the exponential growth of the computational complexity of the code, in practical implementations, we must choose the number of data bits *n*, such that the number of expected trial sequences is not too large. We expect the number of errors to scale as *k*_{0}≈〈*k*〉=*np*/(1−*p*), with higher order corrections scaling as , where the proportionality constant is of order unity. It is not difficult to see that *P*_{rec}(*k*_{0}+δ*k*_{0})∼1−*ϵ* when , yielding
3.3with *g*(*n*,*ϵ*) an undetermined function satisfying as . *g* can in principle be determined numerically, but is analytically difficult to determine directly. This argument determines the number of check bits required to recover from *k*_{0}+δ*k*_{0} errors with high probability, and essentially estimates the number of errors that may occur while the overlap in output sequences remains small (as shown in figure 3). The asymptotic values of *m* are shown in figure 4*a* for various values of *p*, and the leading order term in *n* roughly coincides with the midpoint of the transition between low and high recovery probability. As , the dominant contribution to *m* will be the number of check bits required to reach the transition.

### (c) Higher rates using multiple thresholds

Improved recovery statistics can be attained by using a pair of thresholds, dividing the signal into clear, indeterminate and background ranges. Peaks in the indeterminate range will be a mix of intended peaks and noise peaks, and the error-correction scheme must be able to distinguish between the two. As was the case for the single threshold, it is essential that the noise threshold is low enough that we can be certain that all intended peaks are detected. Likewise, we assume the threshold *I*_{cut} is chosen sufficiently high so that no noise peaks ever fall into the clear range, else our coding scheme would fail. The receiver will find *n*−*l* clear peaks, as well as *k*+*l* indeterminate peaks (with *l* the number of intended peaks that are considered indeterminate and *k* the number of insertions). In order to recover the intended signal, all possible sequences of length *n* containing all *n*−*l* clear peaks and *l* of the indeterminate peaks can be generated, and compared with the check bits. Again, this decoding scheme has high computational complexity for long, noisy signals, with trial sequences being generated. However, the number of trial sequences using a pair of thresholds is strictly less than that when a single threshold is used.

We assume that the probability of an intended peak being considered indeterminate is uniform with probability *q*, and maintain the stringent assumption that all *m* check bits are perfectly recovered. The probability of recovering the intended signal uniquely is then
3.4with and *P*_{ins}(*k*) the uniform probability of insertion. Equation (3.4) counts all possible trial sequences, not just unique trial sequences, and thus underestimates the probability of recovery. The average number of indeterminate peaks will be 〈*k*+*l*〉=*nq*+*np*/(1−*p*), and we can perform the same analysis as in the single-threshold case to estimate the number of bits required,
3.5Equation (3.5) is equivalent to equation (3.3) if the probability of dropping an intended peak *q*=1, and decreases with decreasing *q*. In figure 4*b*, we see that for fixed insertion probability *p*, the recovery probability has a sharper transition for far lower *m* as *q* decreases (see electronic supplementary material, figure 1 as well). For decreasing *q*, the indeterminate error correction becomes more reliable than the insertion correction. A pair of thresholds not only decreases the computational complexity, but also greatly increases the reliability of communication.

### (d) Meta-check bits

In our determination of the number of check bits required to correctly recover the intended sequence, we have thus far made the stringent assumption that the check bits are perfectly recovered. The results presented above can be used with minimal modification if we can be sure of recovering the check bits with high probability. This immediately suggests the use of meta-check bits, which perform the same redundancy on the check bits that the check bits perform on the data. The meta bits will have the form , much similar to equation (3.1). Each of these meta-check bits can suffer from insertion or indeterminate errors as well, so an additional set of bits must be used to check all meta bits as well. The multi-layered level of protection can be continued indefinitely (discussed further in the electronic supplementary material), allowing for an iterative protection scheme ensuring reliable recovery.

In the limit of large *n*, the *k*th meta-check block will require *m*_{k}/*n*=(*m*_{0}/*n*)×(*m*_{k−1}/*n*) bits (with *m*_{0} given in equations (3.3) or (3.5)), i.e.
3.6with the rates *R*=*n*/(*n*+*m*)=1−*m*_{0}/*n* of these codes shown in figure 5. The rate vanishes at a finite value of *p* for all *q*, owing to the fact that a sufficiently noisy transmission would require *m*_{0}>*n*. At worst (with *q*=1 in the single-threshold case), the rate vanishes at . Interestingly, for a binary channel with *N*=2, we find , showing the increased efficiency owing to the non-binary coding. The indeterminate error-correction scheme has an even higher rate as *q* (the probability of an intended peak being considered indeterminate) decreases.

It is worth noting that an alternative to our large alphabet encoding (*N*=7) could be replaced with a binary or trinary system of individual peaks (i.e. *K*=0, Cs=1 and Rb=2), which are grouped together in bytes of length *b*. Using three elements, each byte can attain *N*_{byte}=3^{b} states, so it is useful to determine the efficiency of our error-correcting code using such an encoding system. A system that uses *b*=2 (i.e. *N*_{byte}=9) would still require a pair of bytes to represent an alphanumeric character, but would require twice as many peaks as the *N*=7 case to send the message. Equation (3.3) can be adapted in a straightforward fashion, and we find that for the byte-wise trinary signal. Such a method is thus less efficient than the large alphabet encoding used, despite the fact that *N*_{byte}=9>*N*=7, both in the expected rate of the code (figure 5), as well as in the required fuse length to send a message (*b*=2 transmission would require twice the length of fuse to send the same signal). Bytes grouped with *b*≥3 do allow a higher information rate than the large alphabet *N*=7 encoding, but are more inefficient in terms of fuse length (with a *b*=3 encoding having but requiring 1.5 times the number of bits to send the message). The large alphabet encoding presented here thus strikes an excellent balance between length efficiency and error-correction efficiency, important in the context of infochemistry.

## 4. Experimental results

In order to test the applicability of our coding scheme to the experimental system, we sent the signal ‘tufts’ 19 times using three check bits and one meta-check bit (14 total bits, with a rate *n*/(*n*+*m*)=0.71, the encoding scheme is presented in the electronic supplementary material). The intended signal is 10220010010441, and translates to ‘tuftsiz’ with the four additional error-correcting bits. For each experiment, the intensity of each channel (K, Rb and Cs) was rescaled by the maximum intensity. Each message was decoded from the raw data with a noise threshold of at least for all experiments. However, if more than five insertion errors were detected (the received signal comprised more than 19 bits), the noise threshold was increased to to reduce the number of detected errors. In most cases, the threshold of was unnecessarily low, but there were some experiments where the intensity of an intended bit was of the order of this threshold).

This threshold was applied to all experiments to be absolutely certain that all data peaks would be included. A higher threshold would have produced fewer channel errors by ignoring more of the inserted peaks, so error correction with this threshold demonstrates a worst-case scenario. Peaks were considered to occur simultaneously if their maxima were within 0.08 s, and peaks separated by more than 0.11 s from both of their nearest neighbours would be included in the returned sequence (with all other peaks discarded as noise). The temporal thresholding is also lower than is strictly necessary, as the average spacing between intended peaks is approximately 0.3 s. The use of a lower threshold again ensures that no intended peaks are dropped at the cost of additional insertions or permutations. Using these thresholds with the time trace shown in figure 2, we recover a signal with five insertions and one permutation error: 0102200100104040063. Using a pair of thresholds, we recover the signal 0**10220**0**1**0**01**0**4**0**4**03**6**, where bold-faced numbers correspond to intensities above *I*_{cut}, and therefore are assumed intended (but possibly permuted), and the other numbers are peaks with intensities between *I*_{cut} and *I*_{noise}.

There are a variety of improvements that can be made in the signal processing to reduce the number of errors in the received signal. Although not implemented here, improvements can be made on the detection of clear and indeterminate peaks by using energy-averaging or matched filtering of the signal (J. Kusuma 2011, personal communication). Many spurious peaks are detected by searching for local maxima in the intensity profiles (the method we use in this paper), only some of which can be discarded as spurious using a temporal threshold. By integrating the intensity over *M* time windows (where **I**=(*I*_{K},*I*_{Cs},*I*_{Rb}) is the vector of intensities), the spurious local maxima will have a reduced impact on the final digitized signal. The correct values of *T* and *t*_{0} must be selected via an optimization procedure for each fuse, as small variations in the flame front velocity can alter *T*. We note that if two intended peaks are found within the same time interval *T* using the integrated signal, a deletion error would have occurred in the received signal (as the maximum number of peaks is *M*). Our error-correcting code will fail if such a deletion error occurs. The advantages and disadvantages of alternate methods of digitizing the signal (perhaps requiring different methods of error correction) are not clear when compared with the approach presented in this paper.

Because of the limited field of vision of the telescopic detector over short ranges, the signal was divided into three fuses, with five, five and four bits, respectively. This led to increased separation between blocks of bits (figure 2*a*). The very limited range of temporal cutoffs (a range of 0.03 s for discarding peaks as noise) suggests that this increased spacing does not significantly alter the noise statistics. When the two thresholds were used, the threshold for clear peaks was set to (figure 6*b*). In all cases, we first tried to correct the signals without regard for permutation errors, but often found that the signal was not accurately recovered. We then corrected for a single permutation error (see electronic supplementary material for discussion of permutation correction), which ensured the recovery of the correct sequence in all experiments.

In figure 6, we show histograms of the results of the error correction, using a single (figure 6*a*) or pair of thresholds (figure 6*b*). In both, we show the results when correcting for both zero and one permutation error in the same histogram. We divide the experimental results into attempted recoveries for which no trial sequences satisfied the check bits (upward slashes), those that recovered a unique sequence (downward slashes) and those for which multiple sequences were recovered (horizontal slashes). This histogram is plotted as a function of the number of unique trial sequences generated, rather than the sequences expected from the random coding arguments above. The average results are also listed in table 1. We note that in all but one case, if any signals were recovered one of them was the intended ‘tufts’; however, other spurious matches were possible. In all cases, where the unique sequence was not recovered, a permutation error had occurred but was uncorrected. In general, the number of unique trial sequences generated using a pair of thresholds was lower than the number using when a single threshold was used. This is owing to the additional information created by labelling some peaks as ‘certainly’ intended, and greatly reduces the computational complexity of decoding.

In figure 6, all occurrences of zero-recovered sequences corresponded to at least one permutation error occurring. Correcting permutation errors can greatly increase the number of trial sequences that must be sampled, and increases the probability of finding more than one recovered sequence. The unique intended signal ‘tufts’ was recovered as long as the number of unique trial sequences was less than 3418 (giving rise to three matches), while at worst five sequences were recovered (the intended and four spurious matches) for 21 015 unique trials. The largest number of trial sequences that resulted in a unique (and correct) match was 4207. This shows the importance of correlations in the trial sequences, as for a randomly drawn set of trials, we would expect the probability of recovering a unique match as *P*_{rec}<(1−7^{−3})^{4207}≈5×10^{−6} (see equation (3.2)). The random coding argument presented above significantly underestimates the probability of recovery, and the surprisingly small number of recovered sequences given the number of trial sequences shows the importance of correlations between trial sequences. It is worthwhile to note that even in cases where multiple sequences satisfy all of the check bits, the error-correction scheme still produces a drastic reduction in the number of possibly intended messages. In the worst case, with 21 000 trial sequences, the five recovered messages translate to ‘tufts’, ‘saets’, ‘saosr’, ‘tess3’ and ‘_efs*’. While it is clear that the possibility of multiple spurious matches is a limitation of our coding scheme, it is equally clear that the 21 000 possible sequences are reduced to a set of five, for which the english message is clearly distinguishable.

## 5. Conclusions

We have presented a relatively simple method for error correction for messages sent via a burning fuse patterned with metallic salts. Our coding scheme depends on the experimentally observed noise properties of the system: while noise peaks may cause insertion or indeterminate errors (by being indistinguishable from data peaks), no information is truly lost from the system. Even with the simplest possible representation for the redundant error-correcting bits (see electronic supplementary material), the experiments show that our correction scheme is able to recover the intended signal in the presence of noise with high probability. By introducing layering in the error correction, in the form of the meta-check bits, it is possible to achieve arbitrarily high probability of signal recovery, assuming that errors are sufficiently rare. A random-coding argument shows that there is an achievable, finite rate for error correction in an idealized infofuse system. However, the experiments show a much higher probability of recovery than would be expected given the random coding argument, owing to the correlations in trial sequences.

The highly efficient nature of the code is due primarily to the non-binary nature of the encoding in the infofuse. A binary or trinary signal can be encoded by grouping bits into bytes of finite length in order to produce a large number of states per byte, but has an associated cost in increasing the required length of the signal. We have seen that not only does byte-wise trinary communication require effectively doubling the length of the signal, but also that the increase in the average number of errors reduces the efficiency of the code in comparison with a large *N* bit-wise coding. It is worthwhile to note that this code could be applied to a system where a larger *N* is used for encoding, such as increasing the number of emissive salts used in the preparation of the infofuse. Likewise, higher information density could be achieved by using the concentration of salts in each spot to encode information (Thomas *et al.* 2009). While the choice of appropriate thresholds could be more difficult in this situation, we expect that increasing the alphabet size will give an increase in the efficiency in the code (Mitzenmacher 2006).

The particular nature of the noise observed in the infofuse, coupled with the great advantages in large *N* encoding, suggests that there may be great advantages in certain cases in chemical communication. One can envision a multitude of physical or chemical systems in which the number of ‘bits’ can represent many states (large *N*) in a novel form of communication. For each, all that remains is a more complete understanding of the errors caused in transmission, and how to design an error-correcting code that takes advantage of the noise characteristics for each. In some cases, the many well-developed codes designed for binary communication may be adaptable or immediately applicable, but new techniques may be required to fully use the advantages of each system while simultaneously overcoming the inherent disadvantages in the same.

## Acknowledgements

We gratefully acknowledge useful conversations with J. Kusuma regarding methods of digitizing the signal.

- Received May 19, 2011.
- Accepted September 1, 2011.

- This journal is © 2011 The Royal Society