Semantic Technology Conference notes
I attended the Semantic Technology Conference in San Jose last week (May 18-22). It was interesting to attend this conference so soon after I participated in JavaOne 2008 earlier in the month. At JavaOne I had many responsibilities (give a presentation, meet with customers and partners, attend sessions that I championed, ...) so I did not get to see much of the pavilion, none of the keynotes and only a few sessions. In other words, I worked. At the SemTech conference, by contrast, I had no responsibilities, so I was able to attend many sessions and all the keynotes, and to visit the booths for an extended period.
It was great to meet up with my "semantic" buddies Henry Story and Dean Allemang. I first met them at Jazoon in Zurich last year. Dean has a great new book out: Semantic Web for the Working Ontologist, which he coauthored with Jim Hendler. I highly recommend this book regardless of the level of your knowledge about the semantic web.
I took lots of notes, which I had hoped to post day-by-day or at least clean up and post. It's clear to me now that I'll never find the time for those activities, so here they are: unedited, unspellchecked and unformatted. I hope you find them useful.
Regards,
Harold
Contents
- Monday, May 19, 2008
- Semantic Wikis - Ontoprise, Dr. Jurgen Angele
- Tuesday, May 20, 2008
- Twine - Radar Networks, Nova Spivack
- Zepheira - Eric Miller
- Persistent Identifiers - Zepheira, David Wood
- Freebase - Metaweb, Jamie Taylor
- Federated Terminology Authoring Using Semantic MediaWiki
- What to Do with an OWL Reasoner: Introduction to Pellet - Clark & Parsia, Evren Sirin
- Smart Browser - AdaptiveBlue, Alex Iskold
- Wednesday, May 21, 2008
- Linked Data Panel
- Access Control Policies and their Use in Shared Security Services - Semantic Arts, Timothy Swanson
- Analyzing Web Access Control Policies Using Semantic Technologies - Clark & Parsia, Michael Smith
- Calais - Thomson Reuters, Thomas Tague
- Sindice - DERI, Dr. Giovanni Tummarello
- Rising Stars Panel
- Semantic Markup of Java Source Code - Build Software LLC, Brian D. Eubanks
- Thursday, May 22, 2008
- SA-REST - Kno.e.sis, Karthik Gomadam, Dr. Amit
- ELMO: Mapping Objects to RDF - James Leigh
- Panel - Hendler, Conner, Hall, ...
- TopBraid tutorial - Dean Allemang
Semantic Wikis, Prof. Dr. Jurgen Angele, CTO ontoprise
5pm
search engine (e.g., Google) - entrance to web 1.0 knowledge base
wikis (e.g., Wikipedia) - entrance to web 2.0 knowledge base
problems of web 2.0 wikis:
- ad-hoc queries not possible
- facts are inconsistent
- summaries (e.g., lists) take high effort
- problems in facts: not (automatically) detectable
semantic wikis - entrance to web 3.0 (SW) knowledge base
semantic wiki - ability to create ontologies
www.ontoprise.de
halo project
checks:
- holds: pop density = inhabitants / country size
- every country has a capital
- born before you die
mediawiki halo-extension to Semantic MediaWiki
app areas: project, quality, innovation, hr, content, knowledge management
ability to import/export OWL
characteristics: collab; struct+unstruct knowledge; content reuse; ad-hoc workflows; simple cost-effective impl
impl
- will expose sparql endpoint soon (now using ASK, need to replace).
- ACL
- will be adding rules engine (but derived rules + ACL tricky)
- replacing MySQL storage with triple storage (Franz, ontobroker, ...)
angele@ontoprise.de
www.ontoprise.com
has booth - with memory stick with running SMW on it.
Tuesday May 20
Twine/Radar Networks - Nova Spivack
8:30am
http://radarnetworks.com/
http://www.twine.com/
service - share what you know about your interests (using the semantic web)
funding timeline: darpa; paul allen; vulcan
enable regular developers not SW savvy to be able to use SW
beta - invite only
facebook - your relationship (who)
linkedin - your career (who)
twine - your interests (communal knowledge) (what)
interest tracking; info mgmt; online communities; collab
organize; share; discover
everything generated from an ontology
they generate app from ontology (besides data being in ontology)
but not limited to that ontology
in fall will allow you to import or create your own ontology
natural language understanding then tagging
target customer: consumers/pros who need to access/share knowledge
uses postgres+solaris for triple store
written in Java
anybody in twine has 10 invitations (ask for more if you run out)
fall: public twines will be visible without password
have a sparql api (not released yet)
have a rest api (not released yet)
will be able to export RDF
funding advice: don't just be a platform - solve a problem
SW Tools - Zepheira, Eric Miller
8:30am
Reuse, Repurpose, Remix
em@zepheira.com
read/write web - but initially web was mostly read
blog made write much easier for "normal" people
technorati analyzes/structures this info
reuse of data - how easy is it?
action: create/publish/analyze
docs: blog/blogger/technorati
music:
remix: simile; amara; purl (purlz.org); aduna/zepheira; remix - exhibit mixer
everything becomes a web resource (transforms; data; resource oriented arch)
linked enterprise data
sell it: solve "their" problem (underneath provide resources for future)
business as a web:
- web's problems are enterprise problems
- silos, change, multiple parties, formats, much data
- brittle
- common addressing, linking, data frameworks
Be "IN" the web, not just "ON" it.
If your employees are your most important asset, then empower them.
Persistent Identifiers for the "Real Web", Zepheira, David Wood, Eric Miller
10:30am
Cool URIs don't change.
One thing that makes a URI useful is that you can resolve it.
BAD URI: machine name, port, path, infrastructure (e.g., jsp, php)
metadata encoded in URL.
Real web: you, me, your laptop, this conference
Conceptual graph of relationships of the things in our lives.
We (almost) do this on the web.
Need to turn web of documents into web of structured data (conceptual graph).
*URL Curation* (PURL Server) : http://purl.org
Open source, open standards: purlz.org
Original motivation: change in host; hosting organization
Anything that breaks a link costs money.
MediaWiki has shorthand for PURLs
Impl:
Uses 1060 NetKernel
TYPES:
301 moved permanently
302 found
*303 see other* - tie to SW
307 temporary redirect
404 not found
410 gone
Difference between URI as non-resolvable identifier or a resolvable reference?
301/302 : information resource
303 : physical / conceptual resource (e.g., Moby Dick - see Wikipedia entry)
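A minimal sketch (mine, not from the talk) of how the PURL types above could drive a resolver: each PURL carries one of the listed HTTP status codes, and the redirect types also carry a target. The `Purl` class, the `resolve` function and the example paths are hypothetical names for illustration; this is not the purlz.org implementation.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Status codes for the PURL types listed in the notes.
REDIRECT_TYPES = {301, 302, 303, 307}   # these answer with a Location target
TERMINAL_TYPES = {404, 410}             # these answer with no target


@dataclass(frozen=True)
class Purl:
    """A hypothetical PURL record: persistent path, type, optional target."""
    path: str
    status: int
    target: Optional[str] = None


def resolve(table: Dict[str, Purl], path: str) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, Location target or None) for a requested path."""
    purl = table.get(path)
    if purl is None:
        return 404, None                 # unknown PURL: not found
    if purl.status in REDIRECT_TYPES:
        return purl.status, purl.target  # 303 is the one that ties to the SW
    return purl.status, None             # e.g. 410 gone


# Hypothetical entries: a moved document (301) and a conceptual resource (303).
table = {
    p.path: p
    for p in [
        Purl("/docs/spec", 301, "http://example.org/spec.html"),
        Purl("/id/moby-dick", 303, "http://en.wikipedia.org/wiki/Moby-Dick"),
        Purl("/old/report", 410),
    ]
}
```

The 303 entry is the interesting one: the PURL names the thing itself (the novel), and "see other" points at a document about it rather than claiming to be it.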
Identify resources by URIs
Use "See Also" PURLs to ensure cross-boundary data integrity
Don't reinvent the wheel:
reuse common public and partner URIs before minting your own
Share a PURL service or use more than one
Coordinate at the info space level (more than code or APIs)
Local control, global access
Experiment with Active PURLs
Active PURL participates in the provision of metadata about resources it represents.
purlz.org
purl.org
en.wikipedia.org/wiki/Persistent_Uniform_Resource_Locator
zerpheria.com/talks/semtech2008-dwem.pdf
Freebase - Metaweb, Jamie Taylor
11:45am
His team is responsible for seeding the database.
An open, shared database of the world's knowledge
Creative Commons Attribution License
Open APIs
Community built: collective editing; collaborative semantics
Consensual reality: available data; people, places, products, ...;
Called Topics (e.g., Pontiffs, Art, Airplanes, Cheese, Tropical Storms)
3.4 million topics; 750K people; 450K locations; 50K companies; 40K movies; over 1000 types and 3000 properties
Initially used some wikipedia articles as seeds of freebase db
Not a formal system (e.g., Cyc, SUMO, True Knowledge, Halo)
Not a reasoning engine
Topics in Freebase are Unique (no two Topics represent the same thing)
research.freebase.com
Federated Terminology Authoring Using Semantic MediaWiki
2:00pm
Terminologies in Health Domain: SNOMED CT / IHTSDO; LOINC; ICF-10, DRGs ICF 9/10; CPT; HL7
Need them to
- compare
- aggregate
- interchange
- secondary uses
- linkage to decision support services
Terminologies centrally curated and distributed:
- quality, consistency, accuracy
  (consistency includes deleting an ID and never using it for something else in the future)
But slow to deal with user (i.e., the terminology experts) feedback (e.g., spelling errors, relationships between codes, descriptions)
User best source of info regarding: problems, requirements, uses.
Central assurance necessary but does not scale.
Problem: how to distribute but maintain quality, consistency, accuracy.
Solution: Used semantic media wiki
SMW: collaborative ontology engineering
What to Do with an OWL Reasoner: Introduction to Pellet - Clark & Parsia, LLC, Evren Sirin
3:15pm
OWL-DL (all) reasoner (and much of OWL2)
Open source in Java: pellet.owldl.com
Bindings to Jena, OWL-API, Protege, TopBraid Composer
Next release dual licensed
Reasoning in OWL
Given set of assertions
- check consistency (no contradictions)
- infer new conclusions
Inference:
Penguin subClassOf Bird
Pablo type Penguin
-> Pablo type Bird
Inconsistency:
Bird subClassOf FlyingThing
Penguin subClassOf Bird
Penguin disjointWith FlyingThing
Obvious here, but when thousands of users, need to automate.
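A toy sketch (mine) of the two examples above, assuming only subClassOf, type and disjointWith facts held in plain Python dicts. It computes the subclass closure for each individual and flags an individual that ends up in two disjoint classes. This is not how Pellet works (see the Q&A note at the end of this section) and the function names are hypothetical; it only makes the inference and the contradiction concrete. I add Pablo as an instance in the second case so that the contradiction shows up on an individual.

```python
def superclasses(cls, sub_class_of):
    """All classes reachable from cls via subClassOf (including cls itself)."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(individual, types, sub_class_of):
    """Asserted types of an individual plus everything they are subclasses of."""
    result = set()
    for cls in types.get(individual, ()):
        result |= superclasses(cls, sub_class_of)
    return result


def is_consistent(types, sub_class_of, disjoint):
    """False if any individual is inferred to belong to two disjoint classes."""
    for individual in types:
        inferred = inferred_types(individual, types, sub_class_of)
        for a, b in disjoint:
            if a in inferred and b in inferred:
                return False
    return True


# Inference example: Penguin subClassOf Bird, Pablo type Penguin.
types = {"Pablo": ["Penguin"]}
hierarchy = {"Penguin": ["Bird"]}

# Inconsistency example: additionally Bird subClassOf FlyingThing,
# with Penguin disjointWith FlyingThing.
flying_hierarchy = {"Penguin": ["Bird"], "Bird": ["FlyingThing"]}
disjoint = [("Penguin", "FlyingThing")]
```

With the first hierarchy, `inferred_types("Pablo", types, hierarchy)` contains Bird; with the second, `is_consistent(types, flying_hierarchy, disjoint)` is False because Pablo would be both a Penguin and a FlyingThing.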
Features:
- consistency; classification (subclassing between classes);
realization (is instance)
- conjunctive sparql-dl queries
(combine schema and instance query)
- datatype reasoning
single property - e.g., small monitor has screen <= 15in
combination properties - e.g., widescreen has height/width ratio < .75
- SWRL Rules
DL-safe rules : applies only to instances
- Explaining and debugging
Hard to understand large complex ontologies
Pellet can answer:
Why is a certain subclass relation inferred?
Why is a certain ontology inconsistent?
OwlSight - ontology browser
(to demo pellet features, especially consistency, subclassing)
written in Google GWT
Uses of Pellet:
- data integration
describe data sources using ontologies
define mappings between
use reasoning to answer queries
- healthcare and life sciences
terminology development and axiomatization
decision support; intelligent user interfaces; info integration
NCI, SNOMED, GALEN, FMA, OBO, ...
- Service Oriented Architecture
input/output types
pre/post conditions
languages: SA-WSDL, OWL-S
Reasoner supports : matching requests with services
(semi) automated service composition
- Policy Analysis
languages: XACML, WS-Policy
managing is hard: detect security holes, change impact analysis
tools focus on policy enforcement (runtime) not analysis (design-time)
Use OWL reasoning to analyze
Map policy language to OWL (done for XACML and WS-Policy)
Analysis:
policy subsumption, redundancy, incompatibility, verification, querying
- Config management:
find set of components that satisfy requirements and constraints
- Probabilistic reasoning
Many places for uncertainty : uncertain taxonomic relationships,
facts can be uncertain
social network analysis; breast cancer risk assessment
e.g., A sameAs B with N% probability
*most* birds fly
clarkparsia.com
Q&A: Jess is forward-chaining
Prolog is backward-chaining
Pellet is neither.
It is Semantic tableau: add negation of query and search
Smart Browser - AdaptiveBlue, Alex Iskold
4:30pm
Alex Iskold (Founder and CEO of AdaptiveBlue)
www.adaptiveblue.com
alex.iskold@gmail.com
Much info on web for humans - but not for machines.
Need machine readable info/semantics for greater semantic scalability
BlueOrganizer
firefox addon to auto recognize subset of verticals (without needing metadata)
pages, links, text
unveils a layer of things on top of the web
contextual browsing : what happens after search? - you know user is looking at book, address
5:50 - 7:45 Exhibits and Reception
7:30-9:30 Semantic Exchange Reception
Wednesday May 21
Linked Data Panel
8:30am
Ralph Swick, W3C
Danny Ayers, Talis (no show)
Giovanni Tummarello, DERI
Nathan Yergler, Creative Commons
Principles of Naming (for HTML and RDF):
- Use URIs as names for things
- Use HTTP URIs so people can look up those names
- When someone looks up a URI, provide useful info
- Include links to other URIs so that they can discover more things
SINDICE
Semantic Sitemap Extension
www.okkam.com
Access Control Policies and their Use in Shared Security Services - Semantic Arts, Timothy Swanson
9:45am
Message Passing architecture
"Can we use RDF and OWL to represent these messages"?
Security Terminology
- subject, object, operation : user/agent, being requested, CRUD
Motivation
                  +-> HR/DB
Bob -> dev server --> Security Roles/DB
but if Bob leaves company still has access to Roles/DB
Rule-based security
- Want security roles linked to HR/DB
OBAC = Ontology-Based Access Control
- Users that satisfy a set of conditions can perform operations on a system
subject -> subject
object + operation -> object
subject and object are nodes in RDF graph
Turtle notation:
[]
    a :CheckoutRequest ;
    :requestedResource :LibraryTermsOfService
    ...
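A small sketch (mine, not from the talk) of the motivation above: if the access decision is derived from conditions on the subject, including HR data, then Bob loses access the moment HR says he has left. Triples are plain Python tuples; the `hr:` and `sec:` property names, the role name and `may_perform` are all hypothetical.

```python
def may_perform(graph, subject, operation, resource):
    """Grant access only if the subject is an active employee AND holds a
    role that grants the operation on the resource (both read from the graph)."""
    if (subject, "hr:status", "hr:Active") not in graph:
        return False
    roles = {o for s, p, o in graph if s == subject and p == "sec:hasRole"}
    return any((role, "sec:grants", (operation, resource)) in graph for role in roles)


# Hypothetical data: security roles and HR facts live in one graph.
graph = {
    ("ex:Bob", "hr:status", "hr:Active"),
    ("ex:Bob", "sec:hasRole", "sec:Developer"),
    ("sec:Developer", "sec:grants", ("read", "ex:DevServer")),
}

# Bob leaves the company: only the HR fact changes, the role data is untouched.
after_departure = graph - {("ex:Bob", "hr:status", "hr:Active")}
```

`may_perform(graph, "ex:Bob", "read", "ex:DevServer")` is True, and the same call against `after_departure` is False even though nobody cleaned up the role assignment.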
XACML-DL - Analyzing Web Access Control Policies Using Semantic Technologies - Clark & Parsia, Michael Smith
9:45am
Tim focused on market where ACLs have not been written. Feasible to start with RDF.
C&P - focus on existing ACL (don't rewrite to RDF, but manage)
XACML, WS-Policy
- languages for expressing policy constraints
- how to enforce at runtime
- services explored by reduction to KR formalism - design time
  e.g., OWL
  auto-discovery of cross-cutting concerns
  iterative refinement
Policy Analysis
- denotes a set of "acceptable" things by describing them
XACML
- Access Control policies for distributed resources
- Supports many features
  arbitrary attributes in policies
  express negative authorization
  conflict resolution algorithms
- profiles for common methodologies
  Hierarchical Role Based Access Control
XACML language
- Policy is set of Rules
- PolicySet can hold Policies or PolicySets
- Combining algorithms enable modularity: Permit-overrides, deny-overrides, first-applicable, ...
- Access Requests - list of attribute/value pairs: subject, resource, action (operation), environment (context)
Design-time Support
- detection of security holes
- change impact analysis
Services:
- Shallow testing: test subset of possible requests
- Deep testing: all possible combinations
- Comparison :
- Verification : policy satisfies a particular property
- Incompatibility
- Redundancy
- Querying
Primary Difference between OBAC (Tim's above)
- Decisions : R1-Target SubClassOf: R1-Permit
- Combining algorithms
  XACML algos translated to class descriptions
  (R1-P or R2-P) and not(R3-D) SubClassOf: P
  R3-D SubClassOf: D
Challenges
- Some aspects of policy analysis stretch OWL
  Non-monotonicity, built-ins (e.g., math, XPath)
  Change analysis and querying
  production/business rule like behavior
- Goal to provide as much analysis as possible in OWL-DL then use other formalism (without being ad-hoc)
Pint - XACML Policy Analysis Tool
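A simplified sketch (mine) of the three combining algorithms named above, to show why the same rule decisions can yield different policy decisions. It assumes each rule has already been evaluated to Permit, Deny or NotApplicable and deliberately ignores XACML's Indeterminate results and obligations, so it is not a conforming implementation; the function names are my own.

```python
from enum import Enum
from typing import Iterable


class Decision(Enum):
    PERMIT = "Permit"
    DENY = "Deny"
    NOT_APPLICABLE = "NotApplicable"


def permit_overrides(decisions: Iterable[Decision]) -> Decision:
    """Any Permit wins; otherwise any Deny; otherwise NotApplicable."""
    decisions = list(decisions)
    if Decision.PERMIT in decisions:
        return Decision.PERMIT
    if Decision.DENY in decisions:
        return Decision.DENY
    return Decision.NOT_APPLICABLE


def deny_overrides(decisions: Iterable[Decision]) -> Decision:
    """Any Deny wins; otherwise any Permit; otherwise NotApplicable."""
    decisions = list(decisions)
    if Decision.DENY in decisions:
        return Decision.DENY
    if Decision.PERMIT in decisions:
        return Decision.PERMIT
    return Decision.NOT_APPLICABLE


def first_applicable(decisions: Iterable[Decision]) -> Decision:
    """The first rule (in policy order) that applies decides."""
    for decision in decisions:
        if decision is not Decision.NOT_APPLICABLE:
            return decision
    return Decision.NOT_APPLICABLE


# Three rules evaluated against one request, in policy order.
rule_results = [Decision.NOT_APPLICABLE, Decision.PERMIT, Decision.DENY]
```

For `rule_results`, permit-overrides and first-applicable give Permit while deny-overrides gives Deny, which is the kind of difference the design-time analysis services are meant to surface. The deny-overrides case corresponds to the class description above: P holds when some rule permits and no rule denies.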
Calais - Thomson Reuters, Thomas Tague
11:45am
Calais (KahLay)
5 ways to improve content value (do cool stuff) with Calais
Can handle formats: plain text, html, xml (may expose word and pdf in future)
What is it?
- a semantic metadata generation service
- extracts entities, facts and events from unstructured text
- is a web service with toolkits, frameworks, plugins, apps
- available for commercial and non-commercial use
- www.opencalais.com
Why do they do this free service:
- They keep and will leverage the extracted entities, facts, events
- Trend analysis
- Never expose individual document metadata level - only statistical
- People will NOT get access to metadata collected (private)
Use Calais to auto tag historical archives
- Get improved search, navigation (e.g., powerhousemuseum.com)
Use Calais to create microformat metadata for Yahoo! SearchMonkey (SW)
- Better searchability and user experience
- Get Calais API Key
- Download marmoset (PHP)
- Paste Marmoset into your site template
- Wait for the monkey to visit (will then interact with Calais for tags)
Use Calais to drive alerts or feeds based on events (not just keywords)
- Get highly targeted notification of key events your users care about
- detect and alert on significant events in your content
Use Calais to enable semantic knowledge discovery
- Get content insight
- statistical analysis of document semantics
- doc -> calais -> rdf -> flat file -> spreadsheet then ask
Use Calais' growing toolkit of apps (many community contributed)
- No coding
- e.g., tagaroo (wordpress plugin), drupal (CMS), gnosis (firefox plugin - autotagging of entities on visited pages)
Future:
- open correct
- expose PDF, word format handling (done internally but not exposed)
- internal scoring, but externally it's binary
Exhibits:
textwise
aduna sesame and autofocus (use for looking at nodes in cluster)
Dean's book
Sindice.com - DERI, Dr. Giovanni Tummarello
2:45pm
DERI, Galway, Ireland
Sindice.com (CEO) - (SinDeeChe)
Large Scale Searching, Publishing and Remixing of Semantic Data on the Web
Google Social API (microformats + RDF), OpenCalais (RDF), Twine (RDF), Freebase, ...
RDF world: linking open data (LOD)
URIs as entry points (http://dbpedia/resource/berline)
Advanced querying (http://wikipedia.3ba.se/)
Microformats: hCalendar, HCards, HListing, XFN, hVotes, KELKOO, ...
Competitive advantages by consuming the web of *data*
- increase quantity/quality
- profiling
- social networking
Sindice
- locate structured data sources
- semantic web pipes: remix data on the fly
- semantic sitemaps: publish large amount of semantic data
Find semantic sources which
- talk of something with specific process
SemanticWebPipes
sindice.com/developers
Use case : Jira + Beetle + extension = rdf
Rising Stars
4:15pm
David Scott Lewis - moderator
Barney Pell - Power Set
Alex Iskold
Nova Spivack
Ian Davis
Tom Gruber
-------------------------
Barney Pell - Power Set
Queries in natural language -> results -> NLP processing of search results -> structured data
-------------------------
Alex Iskold - AdaptiveBlue
"things in pages, links and text"
brings semantics to existing web
auto evolves with web itself
facilitates contextual browsing
focus on pragmatic simple consumer verticals
-------------------------
Nova Spivack - Radar Networks
- facebook
doyop.com/twine
-------------------------
Ian Davis - Talis
heritage in large metadata systems: libraries, edu, ...
talis.com/platform
-------------------------
Tom Gruber - Stealth-Company.com
what is killer SW app? search, collective intelligence, context browsing?
Your life (on-line) - intelligence at the interface
language understanding, semantic search, machine learning, integrated services
HTML/HTTP: user role: choose your path (link); system role: connect the dots - URI
Portal: user role: choose your channels; system role: deliver the content - frictionless broadcasting
Search engine: user role: state your query; system role: find relevant content and filter - ??
Intelligence at the interface: user role: live your life; system role: tell me what I need to know, help me solve problems, meet needs, be proactive - personalized context-aware AI
[ Carl Thompson SW tech journalist ]
Get Carl's attention:
Nova: Need to get past 100 billion triple barrier (need trillions/federated)
Note: porn industry is generally early adopter of tech - but not SW (yet)
Alex: iphone + SW (go to garden, point at tree, ask "what is it?")
Tom: Don't push technology. What are the human needs?
e.g., location graph + social graph
Ian: Stick in domain you understand
Don't focus on technology
Use SW under the hood
e.g., travel (tripit)
Barney: Search, advertising and publishing
Plug vertical expertise into SW platform
When will the Gartners of the world start paying attention?
Ian: ??
Tom: Old analysts don't get it
Barney: Gartner has been covering since 1997
But no magic quadrants
New content might be better placed in context (rather than on site of person who created it)
Nova: He has talked to the analysts but they don't think of it as A space
They want to know how it will contribute to existing categories
But does their opinion matter that much?
They are playing catchup.
Let's show great practical examples.
Forget about SW and think products
Barney: Prediction: this conference will be twice the size next year
Semantic Markup of Java Source Code - Build Software LLC, Brian D. Eubanks
5:30pm
Author Wicked Cool Java
brian@buildsoftware.com
Why?
- Classes represent (sometimes) real-world concepts
  What are we processing?
- Methods do things
  What are the semantics?
  What are inputs, outputs and assumptions?
- Numbers represent real-world quantities
  Of what - metrics? money? time?
  Which units?
- Apps interop with other systems
  Are we speaking same "language"
What about developer's knowledge?
- Need common ontology for Java concepts
- UML is barely a start
- Location of code
  directory of compiled classes
  jar / war / ear
- Environment
  java version; target platform(s); required libs; app type (CLI, servlet...)
  licenses
How do we obtain code?
- analyze user requirements
- existing code
- orchestrate existing services
- generate code
- write new code
Where do we run our code?
- how do we execute: compile, package execute ...
How to implement
- Annotate
- External RDF
- Reasoning
Annotations
- metadata for Java classes
- class, field, method level
Annotation Benefits:
- simpler object-relational mapping
  attach RDF metadata to database
  match RDF to annotated Java classes
- any app can understand Java code
  semantics discovered by reading annotations
  generate RDF from Java code
- semantic matching
  remove language, platform barriers
  describe business logic in RDF
  add intelligence
External RDF:
- URIs for compiled classes, jars, source,...
- URIs for Java concepts
External benefits (code/data separation):
- can distinguish between source and lib
- can describe build process
- can describe interactions with other systems ...
- e.g., find open source code that does X
Reasoning
- encode user requirements in RDF
- discover code
- orchestrate code
- generate code
Related APIs
- jenabean.googlecode.com
- www.openrdf.org (ELMO)
- rdfreactor.semweb4j.org
- sommer.dev.java.net
- www.triplescape.com/doapamine
RDF markup benefits
- real-world mappings
- map to RDBS
- build management
- code orchestration
- code gen / translation
- code search
- business logic (rules)
- business process reengineering
- requirements analysis, data flow analysis
Thursday May 22
SA-REST - Kno.e.sis, Karthik Gomadam, Dr. Amit
SA-REST
Why?
- create *interoperable* REST services
- smart mashups
- device independent
Foundations:
- MREF, SA-WSDL
MREF:
- 1996/98
- Representing/Correlating info at meta/semantic level
- abstraction on top of RDF and XML
- href for logical relationships
- virtual resource can be embedded in HTML or linked
SA-WSDL (WSDL-S):
- Add semantics to services
- WSDL + modelreference = SAWSDL
- SAWSDL ModelReference: Add semantic annotation to various parts of WSDL doc
[ Slide 15 : WSDL picture ]
SAWSDL
- grounded to semantic meta-models
- but independent of ontology / meta-model spec languages
Lifting and Lowering
- Systematic approach to data mediation
- mediation at schema level
- agree at meta level instead of syntax level
- XSLT driven
Microformats
- humans first, machines second
- simple open formats on existing standards
- easy to add markups to POSH (Plain Old Semantic HTML)
Microformat design patterns
- abbr : human friendly and machine readable
- class-design : indicate semantic meaning
- rel-design : indicate meaning of a link
SA-REST microformat approach
- Add meaning to service descriptions
- what message formats, what methods
- semantic grounding to concepts : domain of an API; annotated inputs/outputs
input/output
- block markup
- markups within this block relate to the input/output
- pattern : class
domain-rel:
- domains of API
- markup on body element
- pattern : abbr
method:
- get/post
- pattern : class
p-lang-binding:
- programming language binding
- pattern : class
sem-rel
- describes a link in an API
- an XSD schema link
- pattern : abbr
sem-class
- meta description of content in API
- ALA SAWSDL modelref
- pattern : abbr
data-format
- data format descriptor (XML, RSS/ATOM/ gdata...)
- pattern : class
protocol
- soap/rest
- pattern : class
Vehicles
- SA-REST elements can be used with RDFa
- extract RDF via GRDDL
Rules of thumb
- unambiguously identify concept in meta-model
Benefits
- data mediation : like in SAWSDL - upcast/downcast
- map service to Application Data Model (i.e., easier enable mashups)
- smarter mashups
  meta level spec of mashups
  searching for service API in faceted manner
  code generation
Taxonomies available for APIs
- programmableweb.com : 55 categories
- apihut.com/taxonomy : 60 categories, 4 facets
knoesis.wright.edu/research/srl/standards.sarest
www.w3c.org/2005/Incubator/swsc
ELMO: Mapping Objects to RDF - James Leigh
map Java objects --> RDF
@rdf on interface/class; field; property
RDF --> Java objects
- rdf does not contain object "behavior"
- need to associate/find behavior to match with instance data from RDF to create instance
Sesame memory store
- MemoryStore - good for < 1 million triples
- NativeStore
- RDBMS RDF Store
- Mulgara Store - billions
Showed how to map to/from Java, Groovy, JavaScript, Ruby
Benefits:
- people with little RDF knowledge can persist objects into an RDF store.
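A rough Python analogue (mine) of the mapping idea above. ELMO itself is a Java library driven by @rdf annotations; here dataclass field metadata plays the role of the annotation, and triples are plain tuples rather than a Sesame store. `RDF_CLASS`, `to_triples`, `from_triples` and the example URIs are hypothetical names for illustration, not ELMO's API.

```python
from dataclasses import dataclass, field, fields

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"


@dataclass
class Person:
    # Class-level "annotation": the RDF class this Python class maps to.
    RDF_CLASS = FOAF + "Person"
    uri: str
    # Field-level "annotations": the predicate each field maps to.
    name: str = field(metadata={"rdf": FOAF + "name"})
    mbox: str = field(metadata={"rdf": FOAF + "mbox"})


def to_triples(obj):
    """Object --> RDF: one rdf:type triple plus one triple per mapped field."""
    triples = [(obj.uri, RDF_TYPE, type(obj).RDF_CLASS)]
    for f in fields(obj):
        predicate = f.metadata.get("rdf")
        if predicate is not None:
            triples.append((obj.uri, predicate, getattr(obj, f.name)))
    return triples


def from_triples(cls, uri, triples):
    """RDF --> object: the data comes from the triples, the behavior from cls."""
    by_predicate = {p: o for s, p, o in triples if s == uri}
    kwargs = {
        f.name: by_predicate[f.metadata["rdf"]]
        for f in fields(cls)
        if "rdf" in f.metadata
    }
    return cls(uri=uri, **kwargs)


alice = Person("http://example.org/alice", "Alice", "mailto:alice@example.org")
```

`from_triples(Person, alice.uri, to_triples(alice))` rebuilds an equal `Person`. The caller has to say which class to instantiate, which is the point noted above: the RDF carries the instance data, and the behavior has to be found on the object side.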
Panel
Hendler
pollcak/oracle
christine conner/dow jones
ivan/w3c
steve hall/value
gilmore
john
WHAT/HOW
Solve or add value to existing areas (don't lead with SW).
Simplify - but stick with what you know and where your passion lies.
Tools to help spread the technology.
How to connect disparate pieces of info (corporate wikis).
How do you connect customer data that lies inside corporate silos?
Semantics will destroy value before it will add value
e.g., just put out semantic tagged item saying "I am selling X"
reduces value of ebay
open data reduces value of ebay, linkedin, facebook
but then increases value of companies that embrace new open data model
Home run: figure out how to exploit open data.
non-deterministic web is a weakness
(fortune 500 companies need precise info when handling individual customers)
and a strength (trend analysis, suggestions, ...)
Steve: We would like to find and fund a semantic "SharePoint"
Security belongs with the data
TopBraid tutorial - Dean Allemang
RDF is a means for distributing/reconstituting data to/from the web
DBS is used to interpret : what is the meaning of the value in row X / column Y
XMLS is used to validate : structure
Object Models define behavior
RDFS is used for inference
OWL model is used for ...
subClassOf just means rdf:type inheritance
subPropertyOf ...

Comments
by inigo - 2008-06-06 01:45
I've found this post very useful - thank you Harold! I wasn't able to make it to the Semantic Technology Conference, but your notes have been helpful in letting me see some of the things I've missed and letting me see some of the people I should get in touch with.
by haroldcarr - 2008-06-04 13:14
Different sections of this blog post have been useful to some of my colleagues and friends. I'm sorry you did not find it useful. You certainly have the option to ignore it. I'm surprised that if you did not find it useful you found the time to post a comment. Regards, H
by odoncaoa - 2008-06-04 13:05
Hope this isn't rude, but how is this 'stuff' supposed to be of value to anyone besides yourself? Oh, wait a minute.... I get it. You're one of those brilliant kats, like Einstein, right? You're trying to get ahead of the note lookup curve, so that when you pass, and the rest of the science and engineering world figures out what a genius you were, they'll know where to look, yeah? ;^)