Sign up for one of our paid plans with no obligation. Cancel within 30 days of your sign up and we won't charge you a dime. If you cancel outside of the 30 day period, you'll only be charged for the service through the month you cancel.
Sorry, but we don't issue refunds after your initial 30 day trial and we don't prorate for partial months.
Do I get a 30-day free trial if I upgrade?
Nope. The free 30-day trial is only available to folks who sign up for one of our paid plans. If you move from using our free plan to using one of our paid plans, you're ineligible for the 30-day free trial.
Remember, you can cancel within 30 days, or downgrade your plan later if you need to, so you might as well take advantage of the 30-day free trial with one of our paid plans.
Is there a contract involved?
Nope, Notable is a pay-as-you-go service with no contracts. Pay month-to-month, and if you decide to cancel, you'll be billed for the current month, but you won't be billed again.
Are there per-user fees?
Nope. The prices you see above are all inclusive.
How can I pay for Notable?
Pay for Notable on our secure server with your Visa, Mastercard, or American Express. Sorry, we don't accept PayPal, purchase orders, COD orders, or telephone orders.
Can I change plans at any time?
Sure thing. Simply click on the "Account" tab on your dashboard to see your options.
Questions before signing up?
We're happy to answer any questions you have. Just drop us a note and we'll get right back to you.
In a previous Damn Cool Algorithms post, I talked about BK-trees, a clever indexing structure that makes it possible to search for fuzzy matches on a text string based on Levenshtein distance - or any other metric that obeys the triangle inequality. Today, I'm going to describe an alternative approach, which makes it possible to do fuzzy text search in a regular index: Levenshtein automata.
Introduction
The basic insight behind Levenshtein automata is that it's possible to construct a finite state automaton that recognizes exactly the set of strings within a given Levenshtein distance of a target word. We can then feed in any word, and the automaton will accept or reject it based on whether the Levenshtein distance to the target word is at most the distance specified when we constructed the automaton. Further, due to the nature of FSAs, it will do so in O(n) time with the length of the string being tested. Compare this to the standard dynamic programming Levenshtein algorithm, which takes O(mn) time, where m and n are the lengths of the two input words! It's thus immediately apparent that Levenshtein automata provide, at a minimum, a faster way for us to check many words against a single target word and maximum distance - not a bad improvement to start with!
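For reference, here is a minimal sketch of the standard dynamic programming algorithm mentioned above, written for Python 3. The function name `levenshtein` is just a label chosen for this illustration; the nested loops over both words are where the O(mn) cost comes from.

```python
def levenshtein(a, b):
    """Edit distance between a and b, keeping one row of the DP table at a time."""
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # drop ca
                current[j - 1] + 1,            # add cb
                previous[j - 1] + (ca != cb),  # match, or substitute ca by cb
            ))
        previous = current
    return previous[-1]


print(levenshtein("fxod", "food"))
```

Checking many candidate words against one target repeats this whole table for every candidate, which is the cost the automaton avoids.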
Of course, if that were the only benefit of Levenshtein automata, this would be a short article. There's much more to come, but first let's see what a Levenshtein automaton looks like, and how we can build one.
Construction and evaluation
The diagram on the right shows the NFA for a Levenshtein automaton for the word 'food', with maximum edit distance 2. As you can see, it's very regular, and the construction is fairly straightforward. The start state is in the lower left. States are named using an n^e style notation, where n is the number of characters consumed so far, and e is the number of errors. Horizontal transitions represent unmodified characters, vertical transitions represent insertions, and the two types of diagonal transitions represent substitutions (the ones marked with a *) and deletions, respectively.
Let's see how we can construct an NFA such as this given an input word and a maximum edit distance. I won't include the source for the NFA class here, since it's fairly standard; for gory details, see the Gist. Here's the relevant function in Python:
def levenshtein_automata(term, k):
    nfa = NFA((0, 0))
    for i, c in enumerate(term):
        for e in range(k + 1):
            # Correct character
            nfa.add_transition((i, e), c, (i + 1, e))
            if e < k:
                # Deletion
                nfa.add_transition((i, e), NFA.ANY, (i, e + 1))
                # Insertion
                nfa.add_transition((i, e), NFA.EPSILON, (i + 1, e + 1))
                # Substitution
                nfa.add_transition((i, e), NFA.ANY, (i + 1, e + 1))
    for e in range(k + 1):
        if e < k:
            nfa.add_transition((len(term), e), NFA.ANY, (len(term), e + 1))
        nfa.add_final_state((len(term), e))
    return nfa
This should be easy to follow; we're basically constructing the transitions you see in the diagram in the most straightforward manner possible, as well as denoting the correct set of final states. State labels are tuples, rather than strings, with the tuples using the same notation we described above.
Because this is an NFA, there can be multiple active states. Between them, these represent the possible interpretations of the string processed so far. For example, consider the active states after consuming the characters 'f' and 'x':
This indicates there are several possible variations that are consistent with the first two characters 'f' and 'x': A single substitution, as in 'fxod', a single insertion, as in 'fxood', two insertions, as in 'fxfood', or a substitution and a deletion, as in 'fxd'. Also included are several redundant hypotheses, such as a deletion and an insertion, also resulting in 'fxod'. As more characters are processed, some of these possibilities will be eliminated, while other new ones will be introduced. If, after consuming all the characters in the word, an accepting (bolded) state is in the set of current states, there's a way to convert the input word into the target word with two or fewer changes, and we know we can accept the word as valid.
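To make the set of active states concrete, here is a small Python 3 sketch that simulates the NFA directly. The `NFA` class below is a stand-in written for this illustration, not the class from the Gist: its `active_states` and `accepts` methods and its internal layout are assumptions. The construction function is repeated from above only so that the block runs on its own.

```python
class NFA:
    """Minimal stand-in NFA; not the implementation from the Gist."""
    EPSILON = object()  # label for transitions that consume no input
    ANY = object()      # label for transitions that consume any character

    def __init__(self, start):
        self.start = start
        self.transitions = {}  # state -> {label: set of destination states}
        self.finals = set()

    def add_transition(self, src, label, dest):
        self.transitions.setdefault(src, {}).setdefault(label, set()).add(dest)

    def add_final_state(self, state):
        self.finals.add(state)

    def expand(self, states):
        """Add every state reachable through epsilon transitions."""
        seen, todo = set(states), list(states)
        while todo:
            state = todo.pop()
            for nxt in self.transitions.get(state, {}).get(NFA.EPSILON, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
        return frozenset(seen)

    def step(self, states, ch):
        """Consume one character from every active state."""
        nxt = set()
        for state in states:
            edges = self.transitions.get(state, {})
            nxt |= edges.get(ch, set())
            nxt |= edges.get(NFA.ANY, set())
        return self.expand(nxt)

    def active_states(self, word):
        states = self.expand({self.start})
        for ch in word:
            states = self.step(states, ch)
        return states

    def accepts(self, word):
        return bool(self.active_states(word) & self.finals)


def levenshtein_automata(term, k):
    # Same construction as in the article.
    nfa = NFA((0, 0))
    for i, c in enumerate(term):
        for e in range(k + 1):
            nfa.add_transition((i, e), c, (i + 1, e))
            if e < k:
                nfa.add_transition((i, e), NFA.ANY, (i, e + 1))
                nfa.add_transition((i, e), NFA.EPSILON, (i + 1, e + 1))
                nfa.add_transition((i, e), NFA.ANY, (i + 1, e + 1))
    for e in range(k + 1):
        if e < k:
            nfa.add_transition((len(term), e), NFA.ANY, (len(term), e + 1))
        nfa.add_final_state((len(term), e))
    return nfa


nfa = levenshtein_automata("food", 2)
print(sorted(nfa.active_states("fx")))
print(nfa.accepts("fxod"), nfa.accepts("xyz"))
```

Each tuple (n, e) in the printed set is one hypothesis: n characters of 'food' accounted for, e errors spent so far.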
Actually evaluating an NFA directly tends to be fairly computationally expensive, due to the presence of multiple active states, and epsilon transitions (that is, transitions that require no input symbol), so the standard practice is to first convert the NFA to a DFA using powerset construction. Using this algorithm, a DFA is constructed in which each state corresponds to a set of active states in the original NFA. We won't go into detail about powerset construction here, since it's tangential to the main topic. Here's an example of a DFA corresponding to the NFA for the input word 'food' with one allowable error:
Note that we depicted a DFA for one error, as the DFA corresponding to our NFA above is a bit too complex to fit comfortably in a blog post! The DFA above will accept exactly the words that have an edit distance to the word 'food' of 1 or less. Try it out: pick any word, and trace its path through the DFA. If you end up in a bolded state, the word is valid.
I won't include the source for powerset construction here; it's also in the gist for those who care.
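For readers who have not seen it before, the following is a rough Python 3 sketch of the textbook powerset construction. It is not the code from the gist: the NFA here is assumed to be a plain dict mapping `(state, label)` to a set of next states, with `None` standing in for epsilon, and the alphabet is passed in explicitly. It does not model the "any character" transitions a Levenshtein NFA needs.

```python
from collections import deque

EPSILON = None  # label used for epsilon transitions in this sketch


def epsilon_closure(nfa, states):
    """All states reachable from `states` using only epsilon transitions."""
    seen, todo = set(states), list(states)
    while todo:
        state = todo.pop()
        for nxt in nfa.get((state, EPSILON), ()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return frozenset(seen)


def powerset_construction(nfa, start, finals, alphabet):
    """Return (start, transitions, finals) of a DFA whose states are frozensets."""
    dfa_start = epsilon_closure(nfa, {start})
    transitions, dfa_finals = {}, set()
    seen, queue = {dfa_start}, deque([dfa_start])
    while queue:
        current = queue.popleft()
        if current & finals:
            dfa_finals.add(current)
        for ch in alphabet:
            targets = set()
            for state in current:
                targets |= nfa.get((state, ch), set())
            if not targets:
                continue  # no transition on ch: the DFA rejects here
            nxt = epsilon_closure(nfa, targets)
            transitions[(current, ch)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return dfa_start, transitions, dfa_finals


def dfa_accepts(dfa, word):
    state, transitions, finals = dfa
    for ch in word:
        state = transitions.get((state, ch))
        if state is None:
            return False
    return state in finals


# Toy NFA over {a, b} that accepts strings ending in "ab".
toy_nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
toy_dfa = powerset_construction(toy_nfa, 0, {2}, "ab")
print(dfa_accepts(toy_dfa, "aab"), dfa_accepts(toy_dfa, "aba"))
```

Only sets that are actually reachable become DFA states, which is why the result is usually far smaller than the full powerset.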
Briefly returning to the issue of runtime efficiency, you may be wondering how efficient Levenshtein DFA construction is. We can construct the NFA in O(kn) time, where k is the edit distance and n is the length of the target word. Conversion to a DFA has a worst case of O(2^n) time - which leads to a pretty extreme worst case of O(2^(kn)) runtime! Not all is doom and gloom, though, for two reasons: First of all, Levenshtein automata won't come anywhere near the 2^n worst case for DFA construction*. Second, some very clever computer scientists have come up with algorithms to construct the DFA directly in O(n) time [SCHULZ2002FAST], and even a method to skip the DFA construction entirely and use a table-based evaluation method!
Indexing
Now that we've established that it's possible to construct Levenshtein automata, and demonstrated how they work, let's take a look at how we can use them to search an index for fuzzy matches efficiently. The first insight, and the approach many papers [SCHULZ2002FAST] [MIHOV2004FAST] take is to observe that a dictionary - that is, the set of records you want to search - can itself be represented as a DFA. In fact, they're frequently stored as a Trie or a DAWG, both of which can be viewed as special cases of DFAs. Given that both the dictionary and the criteria (the Levenshtein automata) are represented as DFAs, it's then possible to efficiently intersect the two DFAs to find exactly the words in the dictionary that match our criteria, using a very simple procedure that looks something like this:
def intersect(dfa1, dfa2):
    stack = [("", dfa1.start_state, dfa2.start_state)]
    while stack:
        s, state1, state2 = stack.pop()
        # Only follow edges that both DFAs have in common
        for edge in set(dfa1.edges(state1)).intersection(dfa2.edges(state2)):
            # Use fresh names so the remaining edges still start from the popped states
            next1 = dfa1.next(state1, edge)
            next2 = dfa2.next(state2, edge)
            if next1 and next2:
                path = s + edge
                stack.append((path, next1, next2))
                if dfa1.is_final(next1) and dfa2.is_final(next2):
                    yield path
That is, we traverse both DFAs in lockstep, only following edges that both DFAs have in common, and keeping track of the path we've followed so far. Any time both DFAs are in a final state, that word is in the output set, so we output it.
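As a concrete, runnable illustration of the lockstep traversal, here is a Python 3 sketch. It assumes each DFA is a plain `(start, transitions, finals)` tuple, where `transitions` maps a state to a dict of outgoing edges. That representation, the `build_trie` helper, and the small hand-written `pattern` DFA (a stand-in for a real Levenshtein DFA) are all choices made for this example.

```python
def build_trie(words):
    """Represent a dictionary as a DFA: states are prefixes, finals are whole words."""
    transitions = {"": {}}
    finals = set()
    for word in words:
        for i, ch in enumerate(word):
            transitions.setdefault(word[:i], {})[ch] = word[:i + 1]
        transitions.setdefault(word, {})
        finals.add(word)
    return "", transitions, finals


def intersect(dfa1, dfa2):
    """Yield every string accepted by both DFAs."""
    start1, trans1, finals1 = dfa1
    start2, trans2, finals2 = dfa2
    stack = [("", start1, start2)]
    while stack:
        path, s1, s2 = stack.pop()
        if s1 in finals1 and s2 in finals2:
            yield path
        # Only follow edge labels that leave both current states.
        common = trans1.get(s1, {}).keys() & trans2.get(s2, {}).keys()
        for ch in common:
            stack.append((path + ch, trans1[s1][ch], trans2[s2][ch]))


dictionary = build_trie(["bar", "foo", "food", "fool", "foot"])
# Hand-written DFA accepting exactly "food" and "fool".
pattern = (0, {0: {"f": 1}, 1: {"o": 2}, 2: {"o": 3}, 3: {"d": 4, "l": 4}}, {4})
print(sorted(intersect(dictionary, pattern)))
```

The branch for 'bar' is abandoned at its first character because the pattern DFA has no 'b' edge out of its start state, which is the pruning that makes intersection cheaper than testing every word.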
This is all very well if your index is stored as a DFA (or a trie or DAWG), but many indexes aren't: if they're in-memory, they're probably in a sorted list, and if they're on disk, they're probably in a BTree or similar structure. Is there a way we can modify this scheme to work with these sorts of indexes, and still provide a speedup on brute-force approaches? It turns out that there is.
The critical insight here is that with our criteria expressed as a DFA, we can, given an input string that doesn't match, find the next string (lexicographically speaking) that does. Intuitively, this is fairly easy to do: we evaluate the input string against the DFA until we cannot proceed further - for example, because there's no valid transition for the next character. Then, we repeatedly follow the edge that has the lexicographically smallest label until we reach a final state. Two special cases apply here: First, on the first transition, we need to follow the lexicographically smallest label greater than the character that had no valid transition in the preliminary step. Second, if we reach a state with no valid outwards edge, we should backtrack to the previous state, and try again. This is pretty much the 'wall following' maze solving algorithm, as applied to a DFA.
For an example of this, take a look at the DFA for food(1), above, and consider the input word 'foogle'. We consume the first four characters fine, leaving us in state 3^1 4^1. The only out edge from here is 'd', while the next character is 'l', so we backtrack one step to the previous state, 2^1 3^0 3^1 4^1. From here, our next character is 'g', and there's an out-edge for 'h', so we take that edge, leaving us in an accepting state (the same state as previously, in fact, but with a different path to it) with the output string 'fooh' - the lexicographically next string in the DFA after 'foogle'.
Here's the Python code for it, as a method on the DFA class. As previously, I haven't included boilerplate for the DFA, which is all here.
def next_valid_string(self, input):
    state = self.start_state
    stack = []
    # Evaluate the DFA as far as possible
    for i, x in enumerate(input):
        stack.append((input[:i], state, x))
        state = self.next_state(state, x)
        if not state: break
    else:
        stack.append((input[:i+1], state, None))
    if self.is_final(state):
        # Input word is already valid
        return input
    # Perform a 'wall following' search for the lexicographically smallest
    # accepting state.
    while stack:
        path, state, x = stack.pop()
        x = self.find_next_edge(state, x)
        if x:
            path += x
            state = self.next_state(state, x)
            if self.is_final(state):
                return path
            stack.append((path, state, None))
    return None
In the first part of the function, we evaluate the DFA in the normal fashion, keeping a stack of visited states, along with the path so far and the edge we intend to attempt to follow out of them. Then, assuming we didn't find an exact match, we perform the backtracking search we described above, attempting to find the smallest set of transitions we can follow to come to an accepting state. For some caveats about the generality of this function, read on...
Also needed is the utility function find_next_edge, which finds the lexicographically smallest outwards edge from a state that's greater than some given input:
def find_next_edge(self, s, x):
    if x is None:
        x = u'\0'
    else:
        x = unichr(ord(x) + 1)
    state_transitions = self.transitions.get(s, {})
    if x in state_transitions or s in self.defaults:
        return x
    labels = sorted(state_transitions.keys())
    pos = bisect.bisect_left(labels, x)
    if pos < len(labels):
        return labels[pos]
    return None
With some preprocessing, this could be made substantially more efficient - for example, by pregenerating a mapping from each character to the first outgoing edge greater than it, rather than using binary search to find it in many cases. I once again leave such optimizations as an exercise for the reader.
Now that we have this procedure, we can finally describe how to search the index with it. The algorithm is surprisingly simple:
1. Obtain the first element from your index - or alternately, a string you know to be less than any valid string in your index - and call it the 'current' string.
2. Feed the current string into the 'DFA successor' algorithm we outlined above, obtaining the 'next' string.
3. If the next string is equal to the current string, you have found a match - output it, fetch the next element from the index as the current string, and repeat from step 2.
4. If the next string is not equal to the current string, search your index for the first string greater than or equal to the next string. Make this the new current string, and repeat from step 2.
And once again, here's the implementation of this procedure in Python:
def find_all_matches(word, k, lookup_func):
    """Uses lookup_func to find all words within levenshtein distance k of word.

    Args:
        word: The word to look up
        k: Maximum edit distance
        lookup_func: A single argument function that returns the first word in the
            database that is greater than or equal to the input argument.
    Yields:
        Every matching word within levenshtein distance k from the database.
    """
    lev = levenshtein_automata(word, k).to_dfa()
    match = lev.next_valid_string(u'\0')
    while match:
        next = lookup_func(match)
        if not next:
            return
        if match == next:
            yield match
            next = next + u'\0'
        match = lev.next_valid_string(next)
One way of looking at this algorithm is to think of both the Levenshtein DFA and the index as sorted lists, and of the procedure above as similar to App Engine's "zigzag merge join" strategy. We repeatedly look up a string on one side, and use that to jump to the appropriate place on the other side. If there's no matching entry, we use the result of the lookup to jump ahead on the first side, and so forth. The result is that we skip over large numbers of non-matching index entries, as well as large numbers of non-matching Levenshtein strings, saving us the effort of exhaustively enumerating either of them. Hopefully it's apparent from the description that this procedure has the potential to avoid the need to evaluate either all of the index entries, or all of the candidate Levenshtein strings.
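To see the zigzag in action without the NFA and DFA classes from the gist, here is a self-contained Python 3 sketch. It makes one simplifying assumption: instead of building the DFA up front, each DFA state is computed on demand as a frozenset of (characters consumed, errors) pairs. The helper `make_levenshtein_dfa` and the tuple it returns are inventions for this example; `next_valid_string` and `find_all_matches` follow the procedure described above, with a sorted list and `bisect` playing the role of the index.

```python
import bisect


def make_levenshtein_dfa(term, k):
    """Return (start, step, is_final, find_next_edge) for an implicit DFA.

    A DFA state is a frozenset of NFA states (i, e); the empty set means
    there is no valid transition.
    """
    n = len(term)

    def closure(states):
        # Follow the epsilon transitions (i, e) -> (i + 1, e + 1).
        seen, todo = set(states), list(states)
        while todo:
            i, e = todo.pop()
            if i < n and e < k and (i + 1, e + 1) not in seen:
                seen.add((i + 1, e + 1))
                todo.append((i + 1, e + 1))
        return frozenset(seen)

    def step(state, ch):
        nxt = set()
        for i, e in state:
            if i < n and term[i] == ch:
                nxt.add((i + 1, e))          # correct character
            if e < k:
                nxt.add((i, e + 1))          # any character, same position
                if i < n:
                    nxt.add((i + 1, e + 1))  # any character, next position
        return closure(nxt)

    def is_final(state):
        return any(i == n for i, _ in state)

    def find_next_edge(state, ch):
        # Smallest edge label greater than ch, or the smallest label if ch is None.
        ch = "\0" if ch is None else chr(ord(ch) + 1)
        labels = sorted({term[i] for i, _ in state if i < n})
        if ch in labels or any(e < k for _, e in state):
            return ch
        pos = bisect.bisect_left(labels, ch)
        return labels[pos] if pos < len(labels) else None

    return closure({(0, 0)}), step, is_final, find_next_edge


def next_valid_string(dfa, word):
    start, step, is_final, find_next_edge = dfa
    state, stack = start, []
    for i, ch in enumerate(word):  # evaluate the DFA as far as possible
        stack.append((word[:i], state, ch))
        state = step(state, ch)
        if not state:
            break
    else:
        stack.append((word, state, None))
    if is_final(state):
        return word
    while stack:  # 'wall following' search for the next accepted string
        path, state, ch = stack.pop()
        ch = find_next_edge(state, ch)
        if ch is not None:
            path += ch
            state = step(state, ch)
            if is_final(state):
                return path
            stack.append((path, state, None))
    return None


def find_all_matches(term, k, sorted_words):
    dfa = make_levenshtein_dfa(term, k)
    match = next_valid_string(dfa, "\0")
    while match:
        pos = bisect.bisect_left(sorted_words, match)
        if pos == len(sorted_words):
            return
        candidate = sorted_words[pos]
        if candidate == match:
            yield match
            candidate += "\0"
        match = next_valid_string(dfa, candidate)


words = sorted(["bar", "door", "floor", "fo", "fod", "food", "foods",
                "fool", "foot", "good", "mood"])
print(list(find_all_matches("food", 1, words)))
```

Computing states lazily trades the up-front determinization cost for a little work on every step; a precomputed DFA, as in the article, would make each step a single table lookup.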
As a side note, it's not true that for all DFAs it's possible to find a lexicographically minimal successor to any string. For example, consider the successor to the string 'a' in the DFA that recognizes the pattern 'a+b'. The answer is that there isn't one: it would have to consist of an infinite number of 'a' characters followed by a single 'b' character! It's possible to make a fairly simple modification to the procedure outlined above such that it returns a string that's guaranteed to be a prefix of the next string recognized by the DFA, which is sufficient for our purposes. Since Levenshtein DFAs are always finite, though, and thus always have a finite length successor (except for the last string, naturally), we leave such an extension as an exercise for the reader. There are potentially interesting applications one could put this approach to, such as indexed regular expression search, which would require this modification.
Testing
First, let's see this in action. We'll define a simple Matcher class, which provides an implementation of the lookup_func required by our find_all_matches function:
import bisect

class Matcher(object):
    def __init__(self, l):
        self.l = l
        self.probes = 0

    def __call__(self, w):
        # Count each lookup so we can report how many probes a search needed
        self.probes += 1
        pos = bisect.bisect_left(self.l, w)
        if pos < len(self.l):
            return self.l[pos]
        else:
            return None
Note that the only reason we implemented a callable class here is because we want to extract some metrics - specifically, the number of probes made - from the procedure; normally a regular or nested function would be perfectly sufficient. Now, we need a sample dataset. Let's load the web2 dictionary for that:
>>> words = [x.strip().lower().decode('utf-8') for x in open('/usr/share/dict/web2')]
>>> words.sort()
>>> len(words)
234936
We could also use a couple of subsets for testing how things change with scale:
>>> import random
>>> words10 = [x for x in words if random.random() <= 0.1]
>>> words100 = [x for x in words if random.random() <= 0.01]
Working perfectly! Finding the 23 fuzzy matches for 'nice' in the dictionary of nearly 235,000 words required 142 probes. Note that if we assume an alphabet of 26 characters, there are 4+26*4+26*5=238 strings within levenshtein distance 1 of 'nice', so we've made a reasonable saving over exhaustively testing all of them. With larger alphabets, longer strings, or larger edit distances, this saving should be more pronounced. It may be instructive to see how the number of probes varies as a function of word length and dictionary size, by testing it with a variety of inputs:
| String length | Max strings | Small dict | Med dict | Full dict |
| --- | --- | --- | --- | --- |
| 1 | 79 | 47 (59%) | 54 (68%) | 81 (100%) |
| 2 | 132 | 81 (61%) | 103 (78%) | 129 (97%) |
| 3 | 185 | 94 (50%) | 120 (64%) | 147 (79%) |
| 4 | 238 | 94 (39%) | 123 (51%) | 155 (65%) |
| 5 | 291 | 94 (32%) | 124 (43%) | 161 (55%) |
In this table, 'max strings' is the total number of strings within edit distance one of the input string, and the values for small, med, and full dict represent the number of probes required to search the three dictionaries (consisting of 1%, 10% and 100% of the web2 dictionary). All the following rows, at least until 10 characters, required a similar number of probes as row 5. The sample input string used consisted of prefixes of the word 'abracadabra'.
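The 'max strings' column follows the same arithmetic as the 4+26*4+26*5 figure quoted earlier for 'nice'. A small Python 3 sketch, where the function name `max_candidates` is just a label for this illustration:

```python
def max_candidates(n, alphabet_size=26):
    """Upper bound on the strings within edit distance 1 of a word of length n."""
    deletions = n                            # drop any one of the n characters
    substitutions = alphabet_size * n        # replace any character with any letter
    insertions = alphabet_size * (n + 1)     # add any letter at any of n + 1 positions
    return deletions + substitutions + insertions


print([max_candidates(n) for n in range(1, 6)])
```

This counts some strings more than once (substituting a letter for itself, for instance), so it is an upper bound rather than an exact count.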
Several observations are immediately apparent:
For very short strings and large dictionaries, the number of probes is not much lower - if at all - than the maximum number of valid strings, so there's little saving.
As the string gets longer, the number of probes required increases significantly more slowly than the number of potential results, so that at 10 characters, we need only probe 161 of 821 (about 20%) possible results. At commonly encountered word lengths (97% of words in the web2 dictionary are at least 5 characters long), the savings over naively checking every string variation are already significant.
Although the size of the sample dictionaries differs by an order of magnitude, the number of probes required increases only a little each time. This provides encouraging evidence that this will scale well for very large indexes.
It's also instructive to see how this varies for different edit distance thresholds. Here's the same table, for a max edit distance of 2:
| String length | Max strings | Small dict | Med dict | Full dict |
| --- | --- | --- | --- | --- |
| 1 | 2054 | 413 (20%) | 843 (41%) | 1531 (75%) |
| 2 | 10428 | 486 (5%) | 1226 (12%) | 2600 (25%) |
| 3 | 24420 | 644 (3%) | 1643 (7%) | 3229 (13%) |
| 4 | 44030 | 646 (1.5%) | 1676 (4%) | 3366 (8%) |
| 5 | 69258 | 648 (0.9%) | 1676 (2%) | 3377 (5%) |
This is also promising: with an edit distance of 2, although we're having to do a lot more probes, it's a much smaller percentage of the number of candidate strings. With a word length of 5 and an edit distance of 2, having to do 3377 probes is definitely far preferable to doing 69,258 (one for every matching string) or 234,936 (one for every word in the dictionary)!
As a quick comparison, a straightforward BK-tree implementation with the same dictionary requires examining 5858 nodes for a string length of 5 and an edit distance of 1 (with the same sample string used above), while the same query with an edit distance of 2 required 58,928 nodes to be examined! Admittedly, many of these nodes are likely to be on the same disk page, if structured properly, but there's still a startling difference in the number of lookups.
One final note: The second paper we referenced in this article, [MIHOV2004FAST], describes a very nifty construction: a universal Levenshtein automaton. This is a DFA that determines, in linear time, if any pair of words are within a given fixed Levenshtein distance of each other. Adapting the above scheme to this system is, also, left as an exercise for the reader.
The complete source code for this article can be found here.
Significant background noise; voices are very difficult to hear: $0.50
Light to moderate accents***: $0.25
Very pronounced accents***: $0.50
Verbatim transcription+: $0.25
Technical, scientific or medical terminology: $0.25
Time stamps & time codes every 1-5 minutes: $0.25
Digitizing Fee (for tapes, microcassettes, VHS tapes, or any audio source that must be captured in real-time): $0.25
Special formatting or instructions++: Contact Us!
Standard Service (3-5 business days): No Additional Charge!
Rush Service (1-2 business days): $0.75
24-hour Service: $1.25
Same Day Service: $1.50
* If you do not want individual speakers tracked, we will identify them generically (Male/Female, Moderator/Respondent, Interviewer/Interviewee, etc.)
** We will track individual speakers as Male 1, Male 2, Female 1, Female 2, etc. We can also identify by name or some other term (Interviewer, Interviewee 1, Interviewee 2, etc.).
*** Accents can include U.S. regional dialects as well as foreign and non-American accents.
+Verbatim includes all ums, ahs and stutters.
++We can handle a wide range of requirements, including special layouts, document headings and tags, and some closed captioning XML formats.
Transcription Rates
We've simplified our transcription rates to make it easier than ever.
We still bill strictly per recorded minute of audio. Determine the number of speakers in your audio file. Identify if you have any other variables that may affect your audio (our Account Executives will help you with this if you’re not sure). Choose how quickly you want your transcript back.
Thanks again. You folks are great.
BG — Torrance, CA
Your transcripts are excellent, the process is easy, the communication is helpful, and the cost is reasonable. I will definitely recommend you to fellow doctoral students who are working on interview-based research projects.
KJ — Chicago, IL
You folks did a superb job considering the sound quality in various spots. You are definitely my transcription service from now on.
DB — Green Brook, NJ
You guys were great to work with. Excellent communication and prompt service.
EL — Jersey City, NJ
Your service has been awesome. I'll be recommending you to my colleagues.
KE — Levittown, PA
Thanks for the great turn-around on the transcript...You're doing more than a good job of serving your customer. You're helping to get very important messages out to many others who benefit by receiving them.
JA — Omaha, NE
Many thanks, it is a great comfort to know that you guys are at the end of the internet and have such great turnaround/quality.
KG — Reading, UK
We have been very pleased with the service received from Verbal Ink, and we appreciate your time and efforts.
Have you visited storyofcosmetics.org yet? If not, you should! Learn about how major loopholes in U.S. federal law allow the $50 billion beauty industry to put unlimited amounts of chemicals into personal care products with no required testing, no monitoring of health effects and inadequate labeling requirements—making cosmetics among the least-regulated consumer products on the market. Think twice before putting on that lipstick, you might be putting on lead!
Let the Curricula Begin!
The Story of Stuff Project is excited to announce the release of Let There Be…Stuff? – a six-session curriculum that helps Christian teenagers explore the relationship between their consumption, their faith, and the health of the planet.
To celebrate Earth Day, we’re offering this incredible resource for free to the first 1,000 houses of worship or faith leaders who sign up to download it.
Surprise, surprise: the big cosmetics companies aren’t such big fans of the Safe Cosmetics Act of 2010—legislation introduced yesterday to more strictly regulate their business—or of our new movie. The Personal Care Products Council went so far as to issue a statement calling The Story of Cosmetics a “repugnant and absurd…shockumentary.” Whoa!
Delicious is a social bookmarking service that allows you to tag, save, manage and share Web pages all in one place. With emphasis on the power of the community, Delicious greatly improves how people discover, remember and share on the Internet.
Things you can do with Delicious
Bookmark any site on the Internet, and get to it from anywhere
Instead of having different bookmarks on every computer, Delicious makes it easy to have a single set of
bookmarks kept in sync between all of your computers. Even if you're not on a computer you own, you can still
get to your bookmarks on the Delicious website.
Share your bookmarks, and get bookmarks in return
If your friends use Delicious, you can send them interesting bookmarks that they can check out the next time
they log in. Of course, they can do the same for you. As you explore the site and find interesting users, you
can use our Subscriptions and Network features to keep track of the Delicious tags and users you find most
interesting.
Discover the most useful and interesting bookmarks on the web
See what's hot with Delicious users by checking out our popular tags. By looking at popular bookmarks for a
tag, you'll be able to discover the most interesting bookmarks on the topics you're most interested in. Browse
bookmarks on just about anything from the best programming tips to the most popular travel sites, all in an easy
to read format.
Saving a Bookmark
Saving a Bookmark on Delicious is likely to be a little different than what you're used to. Don't worry, we've made
the process intuitive and you'll find that tags and notes will make your bookmarks much easier to manage.
Depending on which buttons you've added to your browser, you can click the "Tag" or "Bookmark this on
Delicious" button to save a new bookmark. Regardless of the buttons you've chosen, you'll see the following fields.
URL:
The URL field is simply the address of the page you're bookmarking. This should be filled out for you. Only
change this if you know what you're doing.
Title:
If you're using one of our bookmarking tools, this will be prefilled with the title of the page you're saving.
Feel free to edit this in any way that makes sense to you.
Notes:
Here's where you may want to write some additional info for yourself or to let others know why you
bookmarked this page.
Tags:
Enter one or more tags separated by spaces here. They are optional, but we suggest using them, as they
make your bookmarks much easier to organize and navigate. For more on tags, read our tags section.
Send:
People and places you want to share your Bookmark with. This could be the social network Twitter, an email address, or a Delicious user.
Message:
A message that will appear with your Tweet, Email, or Delicious message. Limited to 116 characters.
Once a bookmark is saved, this is how it looks in Delicious:
Getting Around
We've organized the site into three main sections: Bookmarks, People and Tags. When you are logged in, these
menus give you quick access to your Bookmarks, Network and Tags pages.
Our search allows you to not only search your own bookmarks, but virtually any context in Delicious. For example,
you can now search a bundle, a tag, your Network, or another person's bookmarks from our handy search box in
the header.
On most pages, our submenus allow you to quickly view Bookmarks, Network, Tags or Subscriptions for yourself
or anyone else. Create a public profile, including a display name, to change how you appear in Delicious to
others.
Use our Tag Bar to quickly navigate your bookmarks by using tags and tag combinations. Just start typing -
autocomplete will help you find the tag you want. You can also sort your bookmarks in a variety of ways to make
your browsing easier.
On the right side of the screen, our sidebars help you navigate tags and people quickly and easily.
You can quickly see what a person's bookmark collection is all about by looking at the top ten tags in their sidebar.
From the sidebar, you can now refine your view to show all bookmarks in a specific bundle. You can now bundle your network and subscriptions too!
Confused about who 'stlhood' is? Give any user in your Network a private nickname.
Try it out!
The best way to learn how to use Delicious is to use Delicious. We designed it to be fun and easy to use, and we
hope you find that to be the case. If you need help with anything, check out our help pages or contact us.
Nightlights Background in Ultramarine Blue by BackgroundsEtc
Watch this: webtreats posted a photo:
Download our free Nightlights wallpaper backgrounds in high quality (1440px by 900px), available in 120 different colors and styles. View the full set here:
backgrounds.mysitemyway.com/free-nightlights-stock-backgr…
Standing on the 2nd floor. Joe and Edmond are over for a short visit. Squeezing my fists as tight as I can. How many birds in the hand are worth one in the bush?
NXT Wall-e Transformable
Check this out: a Lego NXT Wall-e that is transformable and fully self-controlled, using the Lego Mindstorms programming environment. The video shows the transformation, which is quite similar to the original Wall-e. It is, for all I know, the first Lego-built look-alike that is capable of transforming automatically. For more information, see sites.google.com. For questions or comments, email nxtwallet@gmail.com.
Timea Bacsinszky – books place in final. Second seed Timea Bacsinszky is through to her second WTA Tour final after coming from a set down to beat Yvonne Meusberger 1-6 6-4 6-3 in the semi-finals of the Nurnberger Gastenladies in Bad Gastein. And this: Yes, comic books are still a part of Comic-Con. Although film, TV, and games are garnering more and more attention each year, visitors who attend Comic-Con to meet their favorite comic book and graphic novel creators are usually never disappointed. As well as: Earlier this week Amazon.com announced that for the last three months their sales of e-books outnumbered hardcover books. Contributor: Joseph Langeneckert Published: Jul 25, 2010 original. Blog for doonwexeeeq
Three design/build teams vying for the aquatic center project in Bolivar made presentations to the Bolivar Board of Aldermen Thursday, July 15. The aldermen voted to proceed with contract negotiations with the team led by Wirt-Flavin Construction.
This page is best viewed in an up-to-date web browser with style sheets (CSS) enabled. While you will be able to view the content of this page in your current browser, you will not be able to get the full visual experience. And this: The ratings agency Fitch has warned a double-dip contraction could be on the way for Dubai’s housing market. As well as: A group of women in Italy, who claim to have had relationships with Catholic priests, have asked Pope Benedict to rethink the church’s policy of celibacy. original. Blog for ergaramonven
FPS Terminator 2010 – PC Gameplay HD
This video is from the Alpha version of the game! Don’t forget to subscribe for upcoming videos! Released: 2010 Genre: Action (Shooter) / 3D / 1st Person (FPS) Developer: UDK Publisher: The Original Studios
Power Integrations releases energy-efficient standby power supply reference designs using new TOPSwitch-JX
What I've read today:
Idea Integration is pleased to announce its participation in the Microsoft Upstream Reference Architecture Initiative , which will provide energy companies with the basic framework for building effective IT systems while leveraging the expertise of various technology firms in an effort to optimize the IT components of upstream exploration, drilling, production, and financial management. And this: Power Integrations has published two standby power supply reference designs using its recently announced TOPSwitch-JX IC product family. TOPSwitch-JX devices feature multi-mode control, minimizing power wasted in standby and delivering maximum efficiency over a wide range of operating loads. As well as: Wipro’s x73 Manager USB Reference System was recently awarded Global Certification by Continua Health Alliance, a non-profit, collaborative industry organisation dedicated to defining technology standards specifically for healthcare solutions. original. certcunud
Custom Toronto is a promotional products company specializing in customized promotional items, featuring custom tote bags, pens, key chains and much, much more. Perfect for tradeshows, corporate gifts and conferences. As promotional items experts we stand behind our quality products and services. Let us help you with your company logo merchandise needs. Give us a call or drop us an email!
Customer Testimonials
Wow! I'm impressed - I was really concerned about my artwork as all I had was a bad scan of it. Your artwork department worked wonders and my custom pens look great! I give them away as corporate gifts for my business and always get compliments. Thanks again! Tracy L. - North York
Custom Toronto made it really easy to order our eco-friendly custom tote bags! From design to delivery, you guys walked me through each step and kept me informed of my order. Thanks again - and you'll hear from us next year for sure! Tom C. - Scarborough
I've dealt with other promotional companies before and your customer service is the best! Emails were answered within an hour and notification and communication were always right on. Not to mention the custom mugs came on time and look great, thanks! David I. - Markham
This was my first promotional product order and you guys really walked me through the process and made it easy. All I really had to do was send you guys my artwork! The email response was almost instant, thanks again - it was a great experience. Shawn R. - Mississauga
Customer service is the best of any company I’ve dealt with. You guys are always quick to respond and go the extra mile to help a customer out. The custom tote bags came out great and everybody at the tradeshow loved them! I've gladly referred many folks to you guys as well. Michelle L. - Toronto
The premium custom metal pens are great! I give them to our business clients, and always receive compliments on them. Many clients tell me it’s now their favorite everyday pen, which, as you can guess, I'm just thrilled about. Thanks guys! Dominqie F. - Toronto
For Advertising
Promotional Products are the best way to advertise! A recent study released in November 2008 showed that advertising specialties beat out all forms of TV, radio and print as the most cost-effective method of advertising. The study showed that 84% of people interviewed who had received a promotional product remembered the name of the advertiser’s item. 42% of the respondents had a more favorable impression of the advertiser after receiving the item. The majority (62%) of the respondents had done business with the advertiser after receiving a promotional item from them. Click Here to read the study (PDF)
Servicing all of Ontario
We service all of Ontario, including, but not limited to the following cities - if in doubt drop us an email or give us a call!
This document describes the XML schema for the Sitemap protocol.
The Sitemap protocol format consists of XML tags. All data values in a Sitemap must
be entity-escaped. The file itself must be UTF-8 encoded.
The Sitemap must:
Begin with an opening <urlset> tag and
end with a closing </urlset> tag.
Specify the namespace (protocol standard) within the <urlset>
tag.
Include a <url> entry for each URL, as
a parent XML tag.
Include a <loc> child entry for each
<url> parent tag.
All other tags are optional. Support for these optional tags may vary among search
engines. Refer to each search engine's documentation for details.
Also, all URLs in a Sitemap must be from a single host, such as www.example.com
or store.example.com. For further details, refer to the section on Sitemap file
location.
Sample XML Sitemap
The following example shows a Sitemap that contains just one URL and uses all optional
tags. The optional tags are in italics.
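The sample itself did not survive in this copy, so here is a minimal Python sketch of how such a one-URL Sitemap could be generated. The namespace URI is the published Sitemap 0.9 namespace and is not quoted in the text above; the helper name `build_sitemap` and the sample values are illustrative assumptions only. ElementTree entity-escapes text values when it serializes them.

```python
import xml.etree.ElementTree as ET

# Sitemap 0.9 protocol namespace (assumed here; not quoted in the text above).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Build a Sitemap from dicts with a required 'loc' key and optional
    'lastmod', 'changefreq' and 'priority' keys."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")

# One URL that uses every optional tag (illustrative values).
xml_text = build_sitemap([{
    "loc": "http://www.example.com/",
    "lastmod": "2005-01-01",
    "changefreq": "monthly",
    "priority": 0.8,
}])
print(xml_text)
```

When writing the result to disk, encode it as UTF-8, as the protocol requires.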
<urlset>
required
Encapsulates the file and references the current protocol standard.
<url>
required
Parent tag for each URL entry. The remaining tags are children of this tag.
<loc>
required
URL of the page. This URL must begin with the protocol (such as http) and end with
a trailing slash, if your web server requires it. This value must be less than 2,048
characters.
<lastmod>
optional
The date of last modification of the file. This date should be in
W3C Datetime format. This format allows you to omit the time portion, if
desired, and use YYYY-MM-DD.
Note that this tag is separate from the If-Modified-Since (304) header the server
can return, and search engines may use the information from both sources differently.
<changefreq>
optional
How frequently the page is likely to change. This value provides general information
to search engines and may not correlate exactly to how often they crawl the page.
Valid values are:
always
hourly
daily
weekly
monthly
yearly
never
The value "always" should be used to describe documents that change each time they
are accessed. The value "never" should be used to describe archived URLs.
Please note that the value of this tag is considered a hint and not a command.
Even though search engine crawlers may consider this information when making decisions,
they may crawl pages marked "hourly" less frequently than that, and they may crawl
pages marked "yearly" more frequently than that. Crawlers may periodically crawl
pages marked "never" so that they can handle unexpected changes to those pages.
<priority>
optional
The priority of this URL relative to other URLs on your site. Valid values range
from 0.0 to 1.0. This value does not affect how your pages are compared to pages
on other sites—it only lets the search engines know which pages you deem most
important for the crawlers.
The default priority of a page is 0.5.
Please note that the priority you assign to a page is not likely to influence the
position of your URLs in a search engine's result pages. Search engines may use
this information when selecting between URLs on the same site, so you can use this
tag to increase the likelihood that your most important pages are present in a search
index.
Also, please note that assigning a high priority to all of the URLs on your site
is not likely to help you. Since the priority is relative, it is only used to select
between URLs on your site.
Your Sitemap file must be UTF-8 encoded (you can generally do this when you save
the file). As with all XML files, any data values (including URLs) must use entity
escape codes for the characters listed in the table below.
Character            Escape Code
Ampersand       &    &amp;
Single Quote    '    &apos;
Double Quote    "    &quot;
Greater Than    >    &gt;
Less Than       <    &lt;
In addition, all URLs (including the URL of your Sitemap) must be URL-escaped and
encoded for readability by the web server on which they are located. However, if
you are using any sort of script, tool, or log file to generate your URLs (anything
except typing them in by hand), this is usually already done for you. Please check
to make sure that your URLs follow the
RFC-3986 standard for URIs, the RFC-3987
standard for IRIs, and the XML standard.
Below is an example of a URL that uses a non-ASCII character (ü),
as well as a character that requires entity escaping (&):
http://www.example.com/ümlat.php&q=name
Below is that same URL, ISO-8859-1 encoded (for hosting on a server that uses that
encoding) and URL escaped:
http://www.example.com/%FCmlat.php&q=name
Below is that same URL, UTF-8 encoded (for hosting on a server that uses that encoding)
and URL escaped:
http://www.example.com/%C3%BCmlat.php&q=name
Below is that same URL, but also entity escaped:
http://www.example.com/%C3%BCmlat.php&amp;q=name
Sample XML Sitemap
The following example shows a Sitemap in XML format. The Sitemap in the example
contains a small number of URLs, each using a different set of optional parameters.
Using Sitemap index files (to group multiple sitemap files)
You can provide multiple Sitemap files, but each Sitemap file that you provide must
have no more than 50,000 URLs and must be no larger than 10MB (10,485,760 bytes).
If you would like, you may compress your Sitemap files using gzip to reduce your
bandwidth requirement; however the sitemap file once uncompressed must be no larger
than 10MB. If you want to list more than 50,000 URLs, you must create multiple Sitemap
files.
If you do provide multiple Sitemaps, you should then list each Sitemap file in a
Sitemap index file. Sitemap index files may not list more than 50,000 Sitemaps and
must be no larger than 10MB (10,485,760 bytes) and can be compressed. You can have
more than one Sitemap index file. The XML format of a Sitemap index file is very
similar to the XML format of a Sitemap file.
The Sitemap index file must:
Begin with an opening <sitemapindex>
tag and end with a closing </sitemapindex> tag.
Include a <sitemap> entry
for each Sitemap as a parent XML tag.
Include a <loc> child entry for
each <sitemap> parent tag.
The optional <lastmod> tag
is also available for Sitemap index files.
Note: A Sitemap index file can only specify Sitemaps that are found
on the same site as the Sitemap index file. For example, http://www.yoursite.com/sitemap_index.xml
can include Sitemaps on http://www.yoursite.com but not on http://www.example.com
or http://yourhost.yoursite.com. As with Sitemaps, your Sitemap index file must
be UTF-8 encoded.
Sample XML Sitemap Index
The following example shows a Sitemap index that lists two Sitemaps:
Note: Sitemap URLs, like all values in your XML files, must be
entity escaped.
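As a rough illustration of the limits described above, the following Python sketch splits a large URL list into groups of at most 50,000 and builds a Sitemap index that points at one file per group. The file names, the helper names and the namespace constant are assumptions; the sketch does not check the 10MB size limit.

```python
import xml.etree.ElementTree as ET

# Sitemap 0.9 protocol namespace (assumed here; not quoted in the text above).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-Sitemap URL limit stated above

def chunk_urls(urls, size=MAX_URLS):
    """Split a URL list into groups small enough for one Sitemap each."""
    urls = list(urls)
    return [urls[i:i + size] for i in range(0, len(urls), size)]

def build_index(sitemap_locs):
    """Build a Sitemap index with one <sitemap>/<loc> entry per Sitemap."""
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for loc in sitemap_locs:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="unicode")

urls = [f"http://www.example.com/page{i}" for i in range(120_000)]
chunks = chunk_urls(urls)
# Hypothetical file names, one per chunk, all on the same site as the index.
locs = [f"http://www.example.com/sitemap{n}.xml.gz" for n in range(1, len(chunks) + 1)]
index_xml = build_index(locs)
```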
Sitemap Index XML Tag Definitions
Attribute
Description
<sitemapindex>
required
Encapsulates information about all of the Sitemaps in the file.
<sitemap>
required
Encapsulates information about an individual Sitemap.
<loc>
required
Identifies the location of the Sitemap.
This location can be a Sitemap, an Atom file, RSS file or a simple text file.
<lastmod>
optional
Identifies the time that the corresponding Sitemap file was modified. It does not
correspond to the time that any of the pages listed in that Sitemap were changed.
The value for the lastmod tag should be in
W3C Datetime format.
By providing the last modification timestamp, you enable search engine crawlers
to retrieve only a subset of the Sitemaps in the index i.e. a crawler may only retrieve
Sitemaps that were modified since a certain date. This incremental Sitemap fetching
mechanism allows for the rapid discovery of new URLs on very large sites.
The Sitemap protocol enables you to provide details about your pages to search engines,
and we encourage its use since you can provide additional information about site
pages beyond just the URLs. However, in addition to the XML protocol, we support
RSS feeds and text files, which provide more limited information.
Syndication feed
You can provide an RSS (Really Simple Syndication) 2.0 or Atom 0.3 or 1.0 feed. Generally,
you would use this format only if your site already has a syndication feed. Note
that this method may not let search engines know about all the URLs in your site,
since the feed may only provide information on recent URLs, although search engines
can still use that information to find out about other pages on your site during
their normal crawling processes by following links inside pages in the feed. Make
sure that the feed is located in the highest-level directory you want search engines
to crawl. Search engines extract the information from the feed as follows:
<link> field - indicates the URL
modified date field (the <pubDate> field for RSS feeds and the <modified>
date for Atom feeds) - indicates when each URL was last modified. Use of
the modified date field is optional.
Text file
You can provide a simple text file that contains one URL per line. The text file
must follow these guidelines:
The text file must have one URL per line. The URLs cannot contain embedded new lines.
You must fully specify URLs, including the http.
Each text file can contain a maximum of 50,000 URLs and must be no larger than 10MB
(10,485,760 bytes). If your site includes more than 50,000 URLs, you can separate
the list into multiple text files and add each one separately.
The text file must use UTF-8 encoding. You can specify this when you save the file
(for instance, in Notepad, this is listed in the Encoding menu of the Save As dialog
box).
The text file should contain no information other than the list of URLs.
The text file should contain no header or footer information.
If you would like, you may compress your Sitemap text file using gzip to reduce
your bandwidth requirement.
You can name the text file anything you wish. Please check to make sure that your
URLs follow the RFC-3986 standard for URIs and the RFC-3987 standard for IRIs.
You should upload the text file to the highest-level directory you want search engines
to crawl and make sure that you don't list URLs in the text file that are located
in a higher-level directory.
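The guidelines above are easy to check mechanically. The sketch below is one possible validator, not an official tool: the function name and the message strings are made up for illustration, and accepting https as well as http is an assumption (the text only mentions http).

```python
MAX_URLS = 50_000
MAX_BYTES = 10_485_760  # 10MB, as stated above

def check_text_sitemap(data: bytes):
    """Return a list of problems found in a text Sitemap (empty if none)."""
    problems = []
    if len(data) > MAX_BYTES:
        problems.append("file is larger than 10MB")
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return problems + ["file is not valid UTF-8"]
    lines = text.splitlines()
    if len(lines) > MAX_URLS:
        problems.append("more than 50,000 URLs")
    for number, line in enumerate(lines, start=1):
        # Every line must be a fully specified URL, so blank lines,
        # headers and footers are reported too.
        if not line.startswith(("http://", "https://")):
            problems.append(f"line {number}: not a fully specified URL")
    return problems
```

A file with two fully specified URLs, one per line, returns an empty list. A line such as `www.example.com/a` is reported because it lacks the protocol.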
The location of a Sitemap file determines the set of URLs that can be included in
that Sitemap. A Sitemap file located at http://example.com/catalog/sitemap.xml can
include any URLs starting with http://example.com/catalog/ but can not include URLs
starting with http://example.com/images/.
If you have the permission to change http://example.org/path/sitemap.xml, it is
assumed that you also have permission to provide information for URLs with the prefix
http://example.org/path/. Examples of URLs considered valid in http://example.com/catalog/sitemap.xml
include:
Note that this means that all URLs listed in the Sitemap must use the same protocol
(http, in this example) and reside on the same host as the Sitemap. For instance,
if the Sitemap is located at http://www.example.com/sitemap.xml, it can't include
URLs from http://subdomain.example.com.
URLs that are not considered valid are dropped from further consideration. It is
strongly recommended that you place your Sitemap at the root directory of your web
server. For example, if your web server is at example.com, then your Sitemap index
file would be at http://example.com/sitemap.xml. In certain cases, you may need
to produce different Sitemaps for different paths (e.g., if security permissions
in your organization compartmentalize write access to different directories).
If you submit a Sitemap using a path with a port number, you must include that port
number as part of the path in each URL listed in the Sitemap file. For instance,
if your Sitemap is located at http://www.example.com:100/sitemap.xml, then each
URL listed in the Sitemap must begin with http://www.example.com:100.
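The location rules above (same protocol, same host and port, and a path under the Sitemap's directory) can be sketched as a small Python check. The function name is hypothetical, and the comparison is a plain string match that ignores details such as host-name case.

```python
from urllib.parse import urlsplit

def allowed_in_sitemap(sitemap_url, page_url):
    """Return True if page_url may be listed in the Sitemap at sitemap_url."""
    sitemap, page = urlsplit(sitemap_url), urlsplit(page_url)
    # Directory that contains the Sitemap file, with a trailing slash.
    base_path = sitemap.path.rsplit("/", 1)[0] + "/"
    return (
        page.scheme == sitemap.scheme        # same protocol
        and page.netloc == sitemap.netloc    # same host and same port
        and page.path.startswith(base_path)  # at or below the Sitemap's directory
    )

catalog = "http://example.com/catalog/sitemap.xml"
print(allowed_in_sitemap(catalog, "http://example.com/catalog/item1"))    # True
print(allowed_in_sitemap(catalog, "http://example.com/images/logo.png"))  # False
```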
Sitemaps & Cross Submits
To submit Sitemaps for multiple hosts from a single host, you need to "prove" ownership
of the host(s) for which URLs are being submitted in a Sitemap. Here's an example.
Let's say that you want to submit Sitemaps for 3 hosts:
www.host1.com with Sitemap file sitemap-host1.xml
www.host2.com with Sitemap file sitemap-host2.xml
www.host3.com with Sitemap file sitemap-host3.xml
Moreover, you want to place all three Sitemaps on a single host: www.sitemaphost.com.
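So the Sitemap URLs will be:
http://www.sitemaphost.com/sitemap-host1.xml
http://www.sitemaphost.com/sitemap-host2.xml
http://www.sitemaphost.com/sitemap-host3.xml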
By default, this will result in a "cross submission" error since you are trying
to submit URLs for www.host1.com through a Sitemap that is hosted on www.sitemaphost.com
(and same for the other two hosts). One way to avoid the error is to prove that
you own (i.e. have the authority to modify files) www.host1.com. You can do this
by modifying the robots.txt file on www.host1.com to point to the Sitemap on www.sitemaphost.com.
In this example, the robots.txt file at http://www.host1.com/robots.txt would contain
the line "Sitemap: http://www.sitemaphost.com/sitemap-host1.xml". By modifying the
robots.txt file on www.host1.com and having it point to the Sitemap on www.sitemaphost.com,
you have implicitly proven that you own www.host1.com. In other words, whoever controls
the robots.txt file on www.host1.com trusts the Sitemap at http://www.sitemaphost.com/sitemap-host1.xml
to contain URLs for www.host1.com. The same process can be repeated for the other
two hosts.
Now you can submit the Sitemaps on www.sitemaphost.com.
When a particular host's robots.txt, say http://www.host1.com/robots.txt, points
to a Sitemap or a Sitemap index on another host, it is expected that for each of
the target Sitemaps, such as http://www.sitemaphost.com/sitemap-host1.xml, all the
URLs belong to the host pointing to it. This is because, as noted earlier, a Sitemap
is expected to have URLs from a single host only.
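To make the trust relationship concrete, here is a minimal Python sketch of the check a crawler might perform: read the robots.txt of the host whose URLs are being submitted and confirm that it names the Sitemap on the other host. The function names are hypothetical and the parsing is deliberately simple; real crawlers may apply additional rules.

```python
def sitemaps_in_robots(robots_txt):
    """Return the Sitemap URLs declared in a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        # Split at the first colon only, so the URL's own colons survive.
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "sitemap":
            found.append(value.strip())
    return found

def cross_submit_ok(robots_txt, sitemap_url):
    """True if the host's robots.txt points at the given Sitemap URL."""
    return sitemap_url in sitemaps_in_robots(robots_txt)

# robots.txt as it might be served from http://www.host1.com/robots.txt
robots = (
    "User-agent: *\n"
    "Disallow: /private/\n"
    "Sitemap: http://www.sitemaphost.com/sitemap-host1.xml\n"
)
```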
There are a number of tools available to help you validate the structure of your
Sitemap based on this schema. You can find a list of XML-related tools at each of
the following locations:
Once you have created the Sitemap file and placed it on your webserver, you need
to inform the search engines that support this protocol of its location. You can
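do this by:
submitting it via the search engine's submission interface
specifying its location in your robots.txt file
sending an HTTP request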
The search engines can then retrieve your Sitemap and make the URLs available to
their crawlers.
Submitting your Sitemap via the search engine's submission interface
To submit your Sitemap directly to a search engine, which will enable you to receive
status information and any processing errors, refer to each search engine's documentation.
Specifying the Sitemap location in your robots.txt file
You can specify the location of the Sitemap using a robots.txt file. To do this,
simply add the following line including the full URL to the sitemap:
Sitemap: http://www.example.com/sitemap.xml
This directive is independent of the user-agent line, so it doesn't matter where
you place it in your file. If you have a Sitemap index file, you can include the
location of just that file. You don't need to list each individual Sitemap listed
in the index file.
You can specify more than one Sitemap file per robots.txt file.
To submit your Sitemap using an HTTP request (replace <searchengine_URL> with
the URL provided by the search engine), issue your request to the following URL:
<searchengine_URL>/ping?sitemap=sitemap_url
For example, if your Sitemap is located at http://www.example.com/sitemap.gz, your
URL will become:
You can issue the HTTP request using wget, curl, or another mechanism of your choosing.
A successful request will return an HTTP 200 response code; if you receive a different
response, you should resubmit your request. The HTTP 200 response code only indicates
that the search engine has received your Sitemap, not that the Sitemap itself or
the URLs contained in it were valid. An easy way to keep submissions current is to
set up an automated job to generate and submit Sitemaps on a regular basis.
Note: If you are providing a Sitemap index file, you only need
to issue one HTTP request that includes the location of the Sitemap index file;
you do not need to issue individual requests for each Sitemap listed in the index.
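A small Python sketch of building the ping URL follows. The search engine address is a placeholder, the function name is hypothetical, and percent-encoding the Sitemap URL before placing it in the query string is an assumption based on ordinary query-string rules. The sketch only constructs the URL; sending the request is left to wget, curl or similar.

```python
from urllib.parse import quote

def ping_url(search_engine_url, sitemap_url):
    """Build <searchengine_URL>/ping?sitemap=... with the Sitemap URL encoded."""
    encoded = quote(sitemap_url, safe="")  # encode ':' and '/' as well
    return f"{search_engine_url.rstrip('/')}/ping?sitemap={encoded}"

# "http://searchengine.example" stands in for the URL a search engine provides.
print(ping_url("http://searchengine.example", "http://www.example.com/sitemap.gz"))
```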
The Sitemaps protocol enables you to let search engines know what content you would
like indexed. To tell search engines the content you don't want indexed, use a robots.txt
file or robots meta tag. See robotstxt.org
for more information on how to exclude content from search engines.
July 30, 2010 (Last Friday Of July) 11th Annual System Administrator Appreciation Day
A sysadmin unpacked the server for this website from
its box, installed an operating system, patched it for security, made
sure the power and air conditioning was working in the server room,
monitored it for stability, set up the software, and kept backups in
case anything went wrong. All to serve this webpage.
A sysadmin installed the routers, laid the cables,
configured the networks, set up the firewalls, and watched and guided
the traffic for each hop of the network that runs over copper, fiber
optic glass, and even the air itself to bring the Internet to your computer.
All to make sure the webpage found its way from the server to your computer.
Fig. 1 Ted.
A sysadmin makes sure your network connection is safe,
secure, open, and working. A sysadmin makes sure your
computer is working in a healthy way on a healthy network. A
sysadmin takes backups to guard against disaster both human
and otherwise, holds the gates against security threats and crackers,
and keeps the printers going no matter how many copies of the tax code
someone from Accounting prints out.
A sysadmin worries about spam, viruses, spyware, but
also power outages, fires and floods.
When the email server goes down at 2 AM on a Sunday, your sysadmin
is paged, wakes up, and goes to work.
A sysadmin is a professional, who plans, worries, hacks,
fixes, pushes, advocates, protects and creates good computer networks,
to get you your data, to help you do work -- to bring the potential
of computing ever closer to reality.
So if you can read this, thank your sysadmin -- and
know he or she is only one of dozens or possibly hundreds whose work
brings you the email from your aunt on the West Coast, the instant message
from your son at college, the free phone call from the friend in Australia,
and this webpage.
Show your appreciation
Friday, July 30, 2010, is the 11th annual
System Administrator Appreciation Day.
On this special international day, give your System Administrator something
that shows that you truly appreciate their hard work and dedication.
(All day Friday, 24 hours, your own local time-zone).
Let's face it, System
Administrators get no respect 364 days a year. This is the day that
all fellow System Administrators across the globe, will be showered
with expensive sports cars and large piles of cash in appreciation of
their diligent work. But seriously, we are asking for a nice token gift
and some public acknowledgement. It's the least you could do.
Consider all the
daunting tasks and long hours (weekends too.) Let's be honest, sometimes
we don't know our System Administrators as well as they know us. Remember
this is one day to recognize your System Administrator for their workplace
contributions and to promote professional excellence. Thank them for
all the things they do for you and your business.
Ant's buildfiles are written in XML. Each buildfile contains one project
and at least one (default) target. Targets contain task elements.
Each task element of the buildfile can have an id attribute and
can later be referred to by the value supplied to this. The value has
to be unique. (For additional information, see the
Tasks section below.)
default
the default target to use when no target is supplied.
No; however, since Ant 1.6.0,
every project includes an implicit target that contains any and
all top-level tasks and/or types. This target will always be
executed as part of the project's initialization, even when Ant is
run with the -projecthelp option.
basedir
the base directory from which all path calculations are
done. This attribute might be overridden by setting
the "basedir"
property beforehand. When this is done, it must be omitted in the
project tag. If neither the attribute nor the property have
been set, the parent directory of the buildfile will be used.
No
Optionally, a description for the project can be provided as a
top-level <description> element (see the description type).
Each project defines one or more targets.
A target is a set of tasks you want
to be executed. When starting Ant, you can select which target(s) you
want to have executed. When no target is given,
the project's default is used.
A target can depend on other targets. You might have a target for compiling,
for example, and a target for creating a distributable. You can only build a
distributable when you have compiled first, so the distribute target
depends on the compile target. Ant resolves these dependencies.
It should be noted, however, that Ant's depends attribute
only specifies the order in which targets should be executed - it
does not affect whether the target that specifies the dependency(s) gets
executed if the dependent target(s) did not (need to) run.
More information can be found in the
dedicated manual page.
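The ordering effect of the depends attribute can be illustrated with a short Python sketch. It is a simplified model, not Ant's actual implementation: the function name is hypothetical, and it only computes the order in which targets would be visited, with each target appearing at most once.

```python
def execution_order(targets, requested):
    """targets maps each target name to the names in its depends attribute."""
    order = []

    def visit(name, stack=()):
        if name in stack:
            raise ValueError("circular dependency: " + " -> ".join(stack + (name,)))
        if name in order:
            return  # already scheduled; a target runs at most once
        for dep in targets[name]:
            visit(dep, stack + (name,))
        order.append(name)  # dependencies first, then the target itself

    visit(requested)
    return order

# Mirrors a build with init, compile, dist and clean targets.
targets = {"init": [], "compile": ["init"], "dist": ["compile"], "clean": []}
print(execution_order(targets, "dist"))
```

Requesting "dist" yields init, compile, dist in that order, while requesting "clean" yields only clean.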
A task can have multiple attributes (or arguments, if you prefer). The value
of an attribute might contain references to a property. These references will be
resolved before the task is executed.
All tasks share a task name attribute. The value of
this attribute will be used in the logging messages generated by
Ant.
Tasks can be assigned an id attribute:
<taskname id="taskID" ... />
where taskname is the name of the task, and taskID is
a unique identifier for this task.
You can refer to the
corresponding task object in scripts or other tasks via this name.
For example, in scripts you could do:
<script ... >
task1.setFoo("bar");
</script>
to set the foo attribute of this particular task instance.
In another task (written in Java), you can access the instance via
project.getReference("task1").
Note 1: If "task1" has not been run yet, then
it has not been configured (i.e., no attributes have been set), and if it is
going to be configured later, anything you've done to the instance may
be overwritten.
Note2: Future versions of Ant will most likely not
be backward-compatible with this behaviour, since there will likely be no
task instances at all, only proxies.
Properties are an important way to customize a build process or
to just provide shortcuts for strings that are used repeatedly
inside a build file.
In their simplest form, properties are defined in the build file
(for example by the property
task) or might be set outside Ant. A property has a name and a
value; the name is case-sensitive. Properties may be used in the
value of task attributes or in the nested text of tasks that support
them. This is done by placing the property name between
"${" and "}" in the
attribute value. For example, if there is a "builddir"
property with the value "build", then this could be used
in an attribute like this: ${builddir}/classes. This
is resolved at run-time as build/classes.
With Ant 1.8.0 property expansion has become much more powerful
than simple key-value pairs. More details can be
found in the concepts section of this
manual.
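The simple key-value form of property expansion can be modelled in a few lines of Python. This is a sketch of the idea only, with a hypothetical function name; it leaves unknown property names untouched and does not model escaping or the more powerful expansion available since Ant 1.8.0.

```python
import re

def expand(value, properties):
    """Replace each ${name} in value with the matching property value."""
    return re.sub(
        r"\$\{([^}]+)\}",
        # Unknown property names are left exactly as written.
        lambda m: properties.get(m.group(1), m.group(0)),
        value,
    )

print(expand("${builddir}/classes", {"builddir": "build"}))
```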
<project name="MyProject" default="dist" basedir=".">
  <description>
    simple example build file
  </description>
  <!-- set global properties for this build -->
  <property name="src" location="src"/>
  <property name="build" location="build"/>
  <property name="dist" location="dist"/>

  <target name="init">
    <!-- Create the time stamp -->
    <tstamp/>
    <!-- Create the build directory structure used by compile -->
    <mkdir dir="${build}"/>
  </target>

  <target name="compile" depends="init"
        description="compile the source " >
    <!-- Compile the java code from ${src} into ${build} -->
    <javac srcdir="${src}" destdir="${build}"/>
  </target>

  <target name="dist" depends="compile"
        description="generate the distribution" >
    <!-- Create the distribution directory -->
    <mkdir dir="${dist}/lib"/>
    <!-- Put everything in ${build} into the MyProject-${DSTAMP}.jar file -->
    <jar jarfile="${dist}/lib/MyProject-${DSTAMP}.jar" basedir="${build}"/>
  </target>

  <target name="clean"
        description="clean up" >
    <!-- Delete the ${build} and ${dist} directory trees -->
    <delete dir="${build}"/>
    <delete dir="${dist}"/>
  </target>
</project>
Notice that we are declaring properties outside any target. As of
Ant 1.6 all tasks can be declared outside targets (earlier versions
only allowed <property>, <typedef> and
<taskdef>). When you do this they are evaluated before
any targets are executed. Some tasks will generate build failures if
they are used outside of targets as they may cause infinite loops
otherwise (<antcall> for example).
We have given some targets descriptions; this causes the projecthelp
invocation option to list them as public targets with the descriptions; the
other target is internal and not listed.
Finally, for this target to work the source in the src subdirectory
should be stored in a directory tree which matches the package names. Check the
<javac> task for details.
A project can have a set of tokens that might be automatically expanded if
found when a file is copied, when the filtering-copy behavior is selected in the
tasks that support this. These might be set in the buildfile
by the filter task.
Since this can potentially be a very harmful behavior,
the tokens in the files must
be of the form @token@, where
token is the token name that is set
in the <filter> task. This token syntax matches the syntax of other build systems
that perform such filtering and remains sufficiently orthogonal to most
programming and scripting languages, as well as with documentation systems.
Note: If a token with the format @token@
is found in a file, but no
filter is associated with that token, no changes take place;
therefore, no escaping
method is available - but as long as you choose appropriate names for your
tokens, this should not cause problems.
Warning: If you copy binary files with filtering turned on, you can corrupt the
files. This feature should be used with text files only.
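The @token@ behavior described above, including the rule that tokens without a filter are left alone, can be sketched in Python as follows. The function name and the sample tokens are assumptions for illustration; this is not how Ant implements filtering internally.

```python
import re

def apply_filters(text, tokens):
    """Replace @name@ with tokens[name]; leave tokens without a filter as-is."""
    return re.sub(
        r"@([^@\s]+)@",
        lambda m: tokens.get(m.group(1), m.group(0)),
        text,
    )

print(apply_filters("Version @version@ built by @user@", {"version": "1.2"}))
```

Here `@version@` is replaced because a filter exists for it, while `@user@` is copied through unchanged.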
You can specify PATH- and CLASSPATH-type
references using both
":" and ";" as separator
characters. Ant will
convert the separator to the correct character of the current operating
system.
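A simplified Python model of this separator conversion is shown below. It is an illustration under stated assumptions: the function name and the 'unix'/'dos' labels are hypothetical, directory separators are normalized too (as in the arg path example later in this manual), and DOS drive letters such as C:\ are deliberately not handled.

```python
import re

def to_platform_path(path, platform):
    """Convert a ':'- or ';'-separated path list for 'unix' or 'dos'."""
    parts = [p for p in re.split(r"[:;]", path) if p]
    if platform == "dos":
        return ";".join(p.replace("/", "\\") for p in parts)
    return ":".join(p.replace("\\", "/") for p in parts)

print(to_platform_path("lib/a.jar:lib/b.jar;classes", "unix"))
print(to_platform_path("lib/a.jar:lib/b.jar;classes", "dos"))
```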
Wherever path-like values need to be specified, a nested element can
be used. This takes the general form of:
The location attribute specifies a single file or
directory relative to the project's base directory (or an absolute
filename), while the path attribute accepts colon-
or semicolon-separated lists of locations. The path
attribute is intended to be used with predefined paths - in any other
case, multiple elements with location attributes should be
preferred.
Since Ant 1.8.2 the location attribute can also contain a
wildcard in its last path component (i.e. it can end in a
"*") in order to support wildcard CLASSPATHs introduced
with Java6. Ant will not expand or evaluate the wildcards and the
resulting path may not work as anything else but a CLASSPATH - or
even as a CLASSPATH for a Java VM prior to Java6.
As a shortcut, the <classpath> tag
supports path and
location attributes of its own, so:
In addition, one or more
Resource Collections
can be specified as nested elements (these must consist of
file-type resources only).
Additionally, it should be noted that although resource collections are
processed in the order encountered, certain resource collection types
such as fileset,
dirset and
files
are undefined in terms of order.
This builds a path that holds the value of ${classpath},
followed by all jar files in the lib directory,
the classes directory, all directories named
classes under the apps subdirectory of
${build.dir}, except those
that have the text Test in their name, and
the files specified in the referenced FileList.
If you want to use the same path-like structure for several tasks,
you can define them with a <path> element at the
same level as targets, and reference them via their
id attribute--see References for an
example.
By default a path-like structure will re-evaluate all nested
resource collections whenever it is used, which may lead to
unnecessary re-scanning of the filesystem. Since Ant 1.8.0 path has
an optional cache attribute. If it is set to true, the path
instance will only scan its nested resource collections once and
assume they don't change for the rest of the build (the default
for cache is still false). Even if you are using the
path only in a single task, it may improve overall performance to set
cache to true if you are using complex nested
constructs.
A path-like structure can include a reference to another path-like
structure (a path being itself a resource collection)
via nested <path> elements:
In Ant 1.6 a shortcut for converting paths to OS specific strings
in properties has been added. One can use the expression
${toString:pathreference} to convert a path element
reference to a string that can be used for a path argument.
For example:
Several tasks take arguments that will be passed to another
process on the command line. To make it easier to specify arguments
that contain space characters, nested arg elements can be used.
Attribute | Description | Required
--------- | ----------- | --------
value | A single command-line argument; can contain space characters. | Exactly one of value, file, path, pathref or line
file | The name of a file as a single command-line argument; will be replaced with the absolute filename of the file. | Exactly one of value, file, path, pathref or line
path | A string that will be treated as a path-like string as a single command-line argument; you can use ; or : as path separators and Ant will convert it to the platform's local conventions. | Exactly one of value, file, path, pathref or line
pathref | Reference to a path defined elsewhere. Ant will convert it to the platform's local conventions. | Exactly one of value, file, path, pathref or line
line | A space-delimited list of command-line arguments. | Exactly one of value, file, path, pathref or line
prefix | A fixed string to be placed in front of the argument. In the case of a line broken into parts, it will be placed in front of every part. Since Ant 1.8. | No
suffix | A fixed string to be placed immediately after the argument. In the case of a line broken into parts, it will be placed after every part. Since Ant 1.8. | No
It is highly recommended to avoid the line version
when possible. Ant will try to split the command line in a way
similar to what a (Unix) shell would do, but may create something that
is very different from what you expect under some circumstances.
Examples
<arg value="-l -a"/>
is a single command-line argument containing a space character,
not two separate arguments "-l" and "-a".
<arg line="-l -a"/>
This is a command line with two separate arguments, "-l" and "-a".
<arg path="/dir;/dir2:\dir3"/>
is a single command-line argument with the value
\dir;\dir2;\dir3 on DOS-based systems and
/dir:/dir2:/dir3 on Unix-like systems.
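To show the nested elements in context, here is a sketch that passes separate arguments to an external program with the exec task, using one value attribute per argument instead of line. The ls executable and the file name build.xml are assumptions for the example.

```xml
<exec executable="ls">
  <!-- each arg element becomes exactly one command-line argument -->
  <arg value="-l"/>
  <arg value="-a"/>
  <!-- replaced with the absolute filename of build.xml -->
  <arg file="build.xml"/>
</exec>
```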
Any project element can be assigned an identifier using its
id attribute. In most cases the element can subsequently
be referenced by specifying the refid attribute on an
element of the same type. This can be useful if you are going to
replicate the same snippet of XML over and over again--using a
<classpath> structure more than once, for example.
All tasks that use nested elements for
PatternSets,
FileSets,
ZipFileSets or
path-like structures accept references to these structures
as shown in the examples. Using refid on a task will ordinarily
have the same effect (referencing a task already declared), but the user
should be aware that the interpretation of this attribute is dependent on the
implementation of the element upon which it is specified. Some tasks (the
property task is a handy example)
deliberately assign a different meaning to refid.
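The following sketch shows both uses of refid; the ids sources and libs, the target name and the property name are illustrative. The fileset is declared once and reused by two tasks, while the property task reads refid differently: it sets the property to the string value of the referenced path.

```xml
<fileset id="sources" dir="src">
  <include name="**/*.java"/>
</fileset>

<path id="libs">
  <fileset dir="lib" includes="**/*.jar"/>
</path>

<target name="backup">
  <!-- same element type, so refid reuses the fileset declared above -->
  <copy todir="backup">
    <fileset refid="sources"/>
  </copy>
  <zip destfile="sources.zip">
    <fileset refid="sources"/>
  </zip>
  <!-- on the property task, refid yields the path as a string -->
  <property name="libs.string" refid="libs"/>
  <echo message="${libs.string}"/>
</target>
```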
Ant supports a plugin mechanism for using third-party tasks. To use them you
have to take two steps:
place their implementation somewhere Ant can find it
declare them.
Don't add anything to the CLASSPATH environment variable - this is often the
reason for very obscure errors. Use Ant's own mechanisms
for adding libraries:
via command line argument -lib
adding to ${user.home}/.ant/lib
adding to ${ant.home}/lib
For the declaration there are several ways:
declare a single task per instruction using
<taskdef name="taskname" classname="ImplementationClass"/>
    <taskdef name="for" classname="net.sf.antcontrib.logic.For" />
    <for ... />
declare a bundle of tasks using a properties file holding the
taskname-ImplementationClass pairs and <taskdef>
    <taskdef resource="net/sf/antcontrib/antcontrib.properties" />
    <for ... />
declare a bundle of tasks using an XML file holding the
taskname-ImplementationClass pairs and <taskdef>
    <taskdef resource="net/sf/antcontrib/antlib.xml" />
    <for ... />
declare a bundle of tasks using an XML file named antlib.xml, an XML namespace and
the antlib: protocol handler
    <project xmlns:ac="antlib:net.sf.antcontrib"/>
    <ac:for ... />
If you need a special function, you should
have a look at this manual, because Ant provides a lot of tasks
have a look at the external task page in the manual
(or, better, online)