org.gramar.storm.gramar:walkthrough

Differences

This shows you the differences between two versions of the page.

org.gramar.storm.gramar:walkthrough [2016/08/09 20:08]
chrisgerken
org.gramar.storm.gramar:walkthrough [2016/08/13 11:41] (current)
chrisgerken
Line 1: Line 1:
 +===== Walkthrough =====
 +
 This page walks through a sample usage of the org.gramar.storm.gramar **gramar**.  The specifics of the topology to be implemented aren't all that important.  Instead we'll focus on how to use the gramar to generate almost all of the topology implementation.
 +
 +==== Setup ====
 +
 +This walkthrough was run on a new Ubuntu VMware image with some additional installed software:
 +
 +  * Eclipse was downloaded and unzipped
 +  * From Eclipse, **Help --> Install New Software** to install
 +    * M2E from update site http://download.eclipse.org/releases/mars
 +      * If this were a real development effort I'd probably also install EGit from here, too.
 +    * Gramar from update site http://gramar.org/eclipse/gramar
 +    * All gramars from update site http://gramar.org/eclipse/gramars
 +  * In **Window --> Preferences --> Java --> Installed JREs --> Execution Environments** select the installed JDK to be used for Java 1.8
 +  * From a terminal, use sudo apt-get to install the **graphviz** package.  Technically it's not required, but you need graphviz to produce the topology diagrams, and the diagrams will be a key success factor in a real project.
 +    * For a real project, I'd also install git and maven
 +
 +==== The Model ====
  
 We start with an empty Eclipse workspace (except for a general project named "Sandbox").  We **New -> Other... -> Gramar -> Sample Model** and select the sample model for the gramar:
Line 176: Line 194:
 </​root>​ </​root>​
 </​code>​ </​code>​
 +
 +==== Generation ====
  
  Now that the model is complete, we'll generate the 98% or so of the topology implementation that's just boilerplate.  We select the model file in the Navigator (or Java Explorer) view, right-click and select the menu item **Apply Gramar...** to get the gramar application dialog:
Line 194: Line 214:
    
 You see from this screen shot that the gramar has created two projects.  One is a maven-ized Java project containing most of the final Storm implementation and the other project is there for advanced build needs.  We'll focus on the ingest-storm project, which gets its name from the @label attribute of the topology element in our model.  If we had used a label value of "Fred", for example, the storm project would be named "fred-storm".
 +
 +=== Topology Diagram ===
  
 An interesting file you should be using early and often is the IngestTopologySummary file.  It contains source for a graphviz diagram that documents the topology.  Open a terminal, go to the root folder in this project and run the gen_graphics.sh script (generated by the gramar):
Line 201: Line 223:
 Refresh Eclipse and open IngestTopologySummary.png:
  
-{{:org.gramar.storm.gramar:wt11.png|}}
+{{:org.gramar.storm.gramar:wt12.png|}}
 + 
 +This diagram reflects the topology that we defined in the model.  The other diagram shows the same thing, but also annotates the shape of the data on each of the streams:
 +
 +{{:org.gramar.storm.gramar:ingesttopology.png|}}
 + 
 +==== Beans ==== 
 + 
 +Recall that we defined types to describe the fields (names, types and order) in each stream. ​ For each of those types, the gramar generates a Java bean in the **.bean** package: 
 + 
 +{{:​org.gramar.storm.gramar:​wt15.png|}} 
 + 
 +The generated bolts and spouts will emit actual Values objects to the streams and will execute actual Tuple objects from those streams, but the business logic that we'll be expected to write will work with these bean objects.  The class that marshals and unmarshals these beans (from Tuples to beans, from beans to JSON and Values, and back to beans) is the Marshaller class in the **.util** package.
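
To make the split between beans and marshalling concrete, here is a minimal sketch of the idea. It is not the generated code: the names RawMessageBean, MarshallerSketch, toValues and fromValues are hypothetical, and a plain List<Object> stands in for Storm's Values and Tuple so the sketch compiles without Storm on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical bean for a stream whose tuples carry a single "payload" field.
class RawMessageBean {
    private final String payload;

    RawMessageBean(String payload) {
        this.payload = payload;
    }

    String getPayload() {
        return payload;
    }
}

// Hypothetical stand-in for the generated Marshaller class.
class MarshallerSketch {

    // Bean -> ordered field values (what a spout or bolt would emit).
    static List<Object> toValues(RawMessageBean bean) {
        List<Object> values = new ArrayList<>();
        values.add(bean.getPayload());
        return values;
    }

    // Ordered field values -> bean (what business logic would receive).
    static RawMessageBean fromValues(List<Object> values) {
        return new RawMessageBean((String) values.get(0));
    }
}

class BeanSketchDemo {
    public static void main(String[] args) {
        RawMessageBean in = new RawMessageBean("hello");
        RawMessageBean out = MarshallerSketch.fromValues(MarshallerSketch.toValues(in));
        System.out.println(out.getPayload());
    }
}
```

The point of the pattern is that business logic only ever sees the bean; the field order lives in one place.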
 + 
 +==== Bolts and Spouts ==== 
 + 
 +The generated logic for bolts and spouts is factored into Storm-aware code in the actual bolt and spout classes (in the **.bolt** and **.spout** packages, respectively) and helper classes containing business logic in the **.logic** package. ​  
 + 
 +Other observations include: 
 + 
 +  * For each bolt and spout, there is a generated interface that defines the bolt or spout behavior that might be needed by the business logic. 
 +  * Each interface will have an ack() and fail() method, although their use for bolts and spouts is, of course, different. 
 +  * Each interface will define emit methods for each stream onto which the spout or bolt emits.
 +  * Each bolt helper class has a method for each stream from which the bolt reads.  Each of those methods takes the type of bean that defines the stream tuple shape and a reference to the interface for the bolt.  Use the interface reference to emit tuples, to ack and to fail.
 +  * Each spout helper class has a nextTuple() method that takes a reference to the spout's interface.  Use this interface to emit tuples, either with a message ID for reliable topologies or without a message ID for unreliable topologies.
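
The observations above can be sketched for a bolt as follows. This is only an illustration of the pattern under assumed names: ISplitterBoltSketch, SplitterBoltLogicSketch, readFromLines, emitToWords and the two bean types are hypothetical and are not taken from the generated project. A recording implementation of the interface replaces the real Storm-aware bolt class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical beans describing the input and output stream shapes.
class LineBean {
    final String text;
    LineBean(String text) { this.text = text; }
}

class WordBean {
    final String word;
    WordBean(String word) { this.word = word; }
}

// Hypothetical generated-style interface: one emit method per output
// stream, plus ack() and fail().
interface ISplitterBoltSketch {
    void emitToWords(WordBean word);
    void ack();
    void fail();
}

// Hypothetical helper class: one method per input stream, taking the bean
// for that stream and a reference to the bolt's interface.
class SplitterBoltLogicSketch {
    void readFromLines(LineBean line, ISplitterBoltSketch bolt) {
        if (line.text == null) {
            bolt.fail();
            return;
        }
        for (String w : line.text.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                bolt.emitToWords(new WordBean(w));
            }
        }
        bolt.ack();
    }
}

// Recording stand-in for the Storm-aware bolt, useful for unit tests.
class RecordingBoltSketch implements ISplitterBoltSketch {
    final List<String> emitted = new ArrayList<>();
    int acks;
    int fails;

    public void emitToWords(WordBean word) { emitted.add(word.word); }
    public void ack() { acks++; }
    public void fail() { fails++; }
}
```

Because the helper depends only on the interface, the business logic can be exercised without a running topology.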
 + 
 +==== Adding Your Business Logic ==== 
 + 
 +Let's take a look at how you'd modify a spout helper class to add your business logic. 
 + 
 +An important concept with gramars in general, and this gramar in particular, is that portions of the generated code can be modified in such a way that your changes are preserved when you re-apply the gramar to a modified model.  The general pattern is that pairs of begin/end comments delimit the regions whose contents are kept on a gramar re-apply.  For example, the declaration section below falls within a begin/end comment pair, so any declarations we add there will be kept when we re-apply the gramar.  And we will.  Many times.
 + 
 +<code>
 +    // Begin declarations
 +
 +    private static final long serialVersionUID = 1L;
 +
 +    private static final Logger log = Logger.getLogger(APIReaderSpoutLogic.class);
 +    private boolean written = false;
 +
 +    // This is added, but not set.  We'll assign this variable a value in the open() method
 +    private MyMiddleWareClient client;
 +
 +    // End declarations
 +</code>
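
How the gramar preserves these regions internally is not described here. Purely to illustrate the concept, the sketch below shows one way a generator could carry begin/end regions from an old file into freshly generated text. The class ProtectedRegionSketch and its methods are hypothetical, and the regular expression assumes the marker format seen above ("// Begin name" and "// End name").

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the gramar's actual mechanism.
class ProtectedRegionSketch {

    // group 1: begin line, group 2: region name, group 3: body, group 4: end marker
    private static final Pattern REGION = Pattern.compile(
            "(// Begin ([^\\r\\n]+?)[ \\t]*\\r?\\n)(.*?)([ \\t]*// End \\2)",
            Pattern.DOTALL);

    // Collect the hand-written body of every region in the old source.
    static Map<String, String> extract(String source) {
        Map<String, String> bodies = new LinkedHashMap<>();
        Matcher m = REGION.matcher(source);
        while (m.find()) {
            bodies.put(m.group(2), m.group(3));
        }
        return bodies;
    }

    // Put the kept bodies back into the matching regions of regenerated source.
    static String merge(String regenerated, Map<String, String> kept) {
        Matcher m = REGION.matcher(regenerated);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = kept.getOrDefault(m.group(2), m.group(3));
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + body + m.group(4)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Anything outside a begin/end pair comes from the regenerated text, which is why code placed outside the markers would be lost in a scheme like this.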
 + 
 +The open() method is passed the Config, the TopologyContext and a reference to the actual spout.  As generated, the Topology driver class (in the **.topology** package) reads the entire property file and copies all of its key/value pairs into the Config.  Use these properties to construct any variables you'll need for the running of the spout.  Note that there's a close() method, too, if you have shut-down logic.  Remember to keep all of your code between the begin/end comment pairs.
 + 
 +<code>
 +
 +    public void open(Map map, TopologyContext topologyContext, IAPIReaderSpout spout) {
 +
 +        // Begin open() logic
 +
 +        // End open() logic
 +
 +    }
 +
 +    public void close(IAPIReaderSpout spout) {
 +
 +        // Begin close() logic
 +
 +        // End close() logic
 +
 +    }
 +
 +</code>
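
As a sketch of what could go between the open() and close() markers, the example below builds a client from properties found in the config map. Everything specific here is an assumption: the property keys middleware.host and middleware.port, their defaults, and the MiddlewareClientSketch class are hypothetical, and the Storm-specific parameters are left out so the sketch stands alone.

```java
import java.util.Map;

// Hypothetical middleware client, standing in for whatever the spout reads from.
class MiddlewareClientSketch {
    final String host;
    final int port;
    boolean closed;

    MiddlewareClientSketch(String host, int port) {
        this.host = host;
        this.port = port;
    }

    void close() {
        closed = true;
    }
}

// Hypothetical helper showing only the open()/close() pair.
class SpoutLogicOpenSketch {

    private MiddlewareClientSketch client;

    void open(Map<String, Object> config) {
        // Begin open() logic
        // Assumed property keys; use whatever keys your property file defines.
        String host = String.valueOf(config.getOrDefault("middleware.host", "localhost"));
        int port = Integer.parseInt(
                String.valueOf(config.getOrDefault("middleware.port", "8080")));
        client = new MiddlewareClientSketch(host, port);
        // End open() logic
    }

    void close() {
        // Begin close() logic
        if (client != null) {
            client.close();
        }
        // End close() logic
    }

    MiddlewareClientSketch client() {
        return client;
    }
}
```

Values are read through String.valueOf because entries loaded from a property file arrive as strings, while other config entries may not.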
 + 
 +The nextTuple() method is where you put your business logic.  In the example below, we've added three lines of code to construct an instance of RawMessage (the type which defines the shape of stream rawMessages) and then to emit that bean onto the rawMessages stream.  Generated logic in the spout class will marshal the fields in the RawMessage object into a Values object and then emit that Values object to the stream.
 + 
 +Note the generated emit method for the stream.  There will be an explicitly named emit method for each stream that the spout is defined to have.
 + 
 +Note also the other spout lifecycle methods below. 
 + 
 +<code>
 +
 +    public void nextTuple(final IAPIReaderSpout spout) {
 +
 +        // Begin nextTuple() logic
 +
 +        try {
 +
 +            // emit a tuple
 +
 +            String payload = "....";  // Get this value from the middleware
 +            RawMessage rawMessage = new RawMessage(payload);
 +            spout.emitToRawMessages(rawMessage, rawMessage);
 +
 +        } catch (Exception e) {
 +            log.error("APIReaderSpoutLogic nextTuple() error: " + e.toString());
 +        }
 +
 +        // End nextTuple() logic
 +
 +    }
 +
 +    public void open(Map map, TopologyContext topologyContext, IAPIReaderSpout spout) {
 +
 +        // Begin open() logic
 +
 +        // End open() logic
 +
 +    }
 +
 +    public void close(IAPIReaderSpout spout) {
 +
 +        // Begin close() logic
 +
 +        // End close() logic
 +
 +    }
 +
 +    public void activate(IAPIReaderSpout spout) {
 +
 +        // Begin activate() logic
 +
 +        // End activate() logic
 +
 +    }
 +
 +    public void deactivate(IAPIReaderSpout spout) {
 +
 +        // Begin deactivate() logic
 +
 +        // End deactivate() logic
 +
 +    }
 +
 +    public void ack(Object o, IAPIReaderSpout spout) {
 +
 +        // Begin ack() logic
 +
 +        // End ack() logic
 +
 +    }
 +
 +</code>
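
For a reliable topology, Storm calls a spout's ack or fail with the message ID that was supplied when the tuple was emitted. One possible use of those callbacks, sketched under assumptions, is to remember each emitted payload until it is acked and to queue it for replay if it fails. ReliableSpoutLogicSketch and its methods are hypothetical and are not part of the generated helper class.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping that ack() and fail() logic could delegate to.
class ReliableSpoutLogicSketch {

    // Payloads emitted with a message ID and not yet acked or failed.
    private final Map<Object, String> pending = new HashMap<>();

    // Payloads whose tuples failed and should be emitted again.
    private final Deque<String> replay = new ArrayDeque<>();

    // Call right after emitting a tuple with the given message ID.
    void track(Object messageId, String payload) {
        pending.put(messageId, payload);
    }

    // Call from ack(): the tuple was fully processed, so forget it.
    void ack(Object messageId) {
        pending.remove(messageId);
    }

    // Call from fail(): move the payload to the replay queue.
    void fail(Object messageId) {
        String payload = pending.remove(messageId);
        if (payload != null) {
            replay.addLast(payload);
        }
    }

    // Call from nextTuple(): returns a payload to re-emit, or null if none.
    String nextReplay() {
        return replay.pollFirst();
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Fields like these would belong between the declarations markers so that they are kept on a re-apply.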
  
 +==== Evolving the Topology ====
  
 +When you write a Storm topology it will evolve. ​ You'll find yourself adding, removing, merging or splitting spouts and bolts. ​ The streams will change, too.  You'll add and remove streams, change their source or destination and change their fields and their formats. ​ Whenever you need to evolve your topology, you can simply update the original model to which you applied the gramar and apply the gramar again.
  
 +As long as you keep your code changes between the begin/end comment pairs you can make most of the above changes easily while keeping your business logic in place. ​ There are some special cases where you do have to be careful:
  
 +  * If you change the name of a stream, a new method will be generated in the receiving bolt's helper class. ​ Be sure to save the previous read method on the side so you can copy the logic after the gramar re-apply.
 +  * If you change the name (label) of a bolt or spout, a new set of classes will be generated according to the new name, but the classes for the previous label/name will still remain.  You have to copy the code you want to keep by hand.
org.gramar.storm.gramar/walkthrough.1470773336.txt.gz · Last modified: 2016/08/09 20:08 by chrisgerken