Clojure - How do I write an Enlive selector to return "clusters" of tags?


I'm writing Clojure code using Enlive to process a set of XML documents. They're in an XML format that borrows heavily from HTML but adds some custom tags, and my job is to convert them to real HTML. The custom tag that's bothering me right now is <tab>, which is being used in all kinds of places where it shouldn't be. For example, it's used to make lists that should have been made with <ol> and <li>. Here's an example of the kind of thing I'm encountering:

    <p class="normal">some text</p>
    <p class="listwithtabs">(a)<tab />first list item</p>
    <p class="listwithtabs">(b)<tab />second list item</p>
    <p class="listwithtabs">(c)<tab />third list item</p>
    <p class="normal">some more text</p>
    <p class="anotherlist">1.<tab />another list</p>
    <p class="anotherlist">2.<tab />two items time</p>
    <p class="normal">some final text</p>

I want to turn that into:

    <p class="normal">some text</p>
    <ol type="a">
      <li class="listwithtabs">first list item</li>
      <li class="listwithtabs">second list item</li>
      <li class="listwithtabs">third list item</li>
    </ol>
    <p class="normal">some more text</p>
    <ol type="1">
      <li class="anotherlist">another list</li>
      <li class="anotherlist">two items time</li>
    </ol>
    <p class="normal">some final text</p>

To do this, I need to select the <p> elements that contain <tab> descendants (easy with Enlive selectors), and then somehow cluster them according to the natural groupings they had in the original XML documents (much harder).
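For reference, converting a single cluster once it is in hand looks like the easy part. Here is a minimal sketch on plain node maps, assuming Enlive-style nodes with :tag, :attrs and :content, assuming the <tab> is a direct child of the <p>, and using a crude guess for the list type (any digit in the marker means "1", otherwise "a"). The names p->li and cluster->ol are hypothetical helpers, not part of Enlive.

```clojure
;; Sketch only: p->li and cluster->ol are hypothetical helper names.
;; Nodes are assumed to be Enlive-style maps: {:tag ... :attrs ... :content ...}.

(defn p->li
  "Turns a <p> node into an <li>, keeping only the content after the first
  :tab child. Assumes the :tab is a direct child of the <p>."
  [p]
  (let [after-tab (rest (drop-while #(not= :tab (:tag %)) (:content p)))]
    (assoc p :tag :li :content (vec after-tab))))

(defn cluster->ol
  "Wraps a cluster of <p> nodes in an <ol>. The list type is a crude guess
  from the first item's marker text: a digit means \"1\", anything else \"a\"."
  [cluster]
  (let [marker (apply str (take-while string? (:content (first cluster))))]
    {:tag     :ol
     :attrs   {:type (if (re-find #"\d" marker) "1" "a")}
     :content (mapv p->li cluster)}))

;; Example data mirroring the first list in the sample document.
(def example-cluster
  [{:tag :p :attrs {:class "listwithtabs"}
    :content ["(a)" {:tag :tab :attrs nil :content nil} "first list item"]}
   {:tag :p :attrs {:class "listwithtabs"}
    :content ["(b)" {:tag :tab :attrs nil :content nil} "second list item"]}])

(def example-ol (cluster->ol example-cluster))
```

The class attribute is carried over to each <li> unchanged because assoc only replaces :tag and :content.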

I've looked through the documents and determined that I can't rely on the class attribute: sometimes these <p>-that-should-be-<li> elements have the same class as the <p> elements around them, and sometimes there are two successive groups of <p>-that-should-be-<li> elements with the same class as each other (i.e., as if the example I posted had both clusters having class listwithtabs). The one thing I think I can rely on is that there will never be two different lists without at least one non-list element separating them: in other words, any cluster of successive <p> elements that have the property "has at least one <tab> element descendant" is part of the same list.

With that in mind, I did some experimenting at the REPL, with Enlive loaded under the namespace e (that is, (require '[net.cgrand.enlive-html :as e]) should be assumed to be in effect for the rest of this question). It's easy to write a selector to pick out the elements I want, but (e/select snippet [(e/has [:tab])]) returns a single list (well, it's a lazy sequence) of five elements. What I want is a list of lists: the first with three elements and the second with two. Something vaguely like this (pardon the non-standard indentation):

    [
      [{:tag :p, :content (... "first list item" ...)}
       {:tag :p, :content (... "second list item" ...)}
       {:tag :p, :content (... "third list item" ...)}
      ] ; 3 items in first list
      [{:tag :p, :content (... "another list" ...)}
       {:tag :p, :content (... "with 2 items" ...)}
      ] ; 2 items in second list
    ]

I was able to create the following selectors:

    (def first-of-tab-group [(e/has [:tab])
                             (e/left (complement (e/has [:tab])))])
    (def rest-of-tab-group [(e/has [:tab])
                            (e/left (e/has [:tab]))])

But then I'm stuck. I'd like to do something like (e/select snippet [[(e/start-at first-of-tab-group) (e/take-while rest-of-tab-group)]]), but as far as I know Enlive doesn't have functions like start-at and take-while.

It feels like I'm so close, but just missing one final key step. So how do I take that last step? How do I select just a "cluster" of elements that match some rules, but omit other elements that match the same rules but aren't part of the first "cluster"?
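
For comparison, the grouping rule is straightforward to state outside of selectors, on a plain sequence of sibling nodes (for example the :content of the parent element). This is only a sketch of that fallback, not a selector-based answer: has-tab? and tab-clusters are hypothetical helper names, and nodes are assumed to be Enlive-style maps with :tag and :content. Whitespace-only text nodes between elements are dropped first so that they don't split a run.

```clojure
(require '[clojure.string :as str])

;; Sketch only: has-tab? and tab-clusters are hypothetical helper names.

(defn has-tab?
  "True if node has at least one :tab element among its descendants."
  [node]
  (boolean
    (some #(= :tab (:tag %))
          (rest (tree-seq map? :content node)))))

(defn tab-clusters
  "Splits siblings into runs of consecutive nodes that satisfy has-tab?,
  discarding the nodes in between."
  [siblings]
  (->> siblings
       (remove #(and (string? %) (str/blank? %))) ; ignore inter-element whitespace
       (partition-by has-tab?)
       (filter #(has-tab? (first %)))))

;; Example data mirroring the sample document, including whitespace nodes.
(def siblings
  [{:tag :p :content ["some text"]} "\n"
   {:tag :p :content ["(a)" {:tag :tab} "first list item"]} "\n"
   {:tag :p :content ["(b)" {:tag :tab} "second list item"]} "\n"
   {:tag :p :content ["(c)" {:tag :tab} "third list item"]} "\n"
   {:tag :p :content ["some more text"]} "\n"
   {:tag :p :content ["1." {:tag :tab} "another list"]} "\n"
   {:tag :p :content ["2." {:tag :tab} "two items time"]} "\n"
   {:tag :p :content ["some final text"]}])

(map count (tab-clusters siblings)) ; => (3 2)
```

This relies on the same invariant as above: two lists are always separated by at least one non-list element, so every run of consecutive has-tab? nodes is exactly one list.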

