I took lots of notes which I had hoped to post day-by-day or at least
clean up and post. It's clear to me now that I'll never find the time
for those activities, so here they are: unedited, unspellchecked
and unformatted. I hope you find them useful.
Semantic Wikis, Prof. Dr. Jurgen Angele, CTO ontoprise
5pm
search engine (e.g., google) - entrance to web 1.0 knowledge base
wikis (e.g., wikipedia) - entrance to web 2.0 knowledge base
problems of web 2.0 wikis:
- ad-hoc queries not possible
- facts are inconsistent
- summaries (e.g., lists) take high effort
- problems in facts: not (automatically) detectable
semantic wikis - entrance to web 3.0 (SW) knowledge base
semantic wiki
- ability to create ontologies
www.ontoprise.de
halo project
checks:
- holds: pop density = inhabitants / country size
- every country has a capital
- born before you die
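The checks above could be sketched as plain validation functions over extracted facts. This is only an illustration, not how the halo extension does it; the record fields (population, area_km2, density, capital, born, died) and the 1% tolerance are assumptions.

```python
def check_country(c, tolerance=0.01):
    """Return a list of problems found in one country record (a dict)."""
    problems = []
    # Every country has a capital.
    if not c.get("capital"):
        problems.append("missing capital")
    # Stated density must match population / area within a relative tolerance.
    expected = c["population"] / c["area_km2"]
    if abs(c["density"] - expected) > tolerance * expected:
        problems.append("density does not match population / area")
    return problems


def check_person(p):
    """Return a list of problems found in one person record (a dict)."""
    problems = []
    # Born before you die (died may be None for living people).
    if p.get("died") is not None and p["died"] < p["born"]:
        problems.append("died before born")
    return problems
```

In a semantic wiki such checks would run as queries or rules over the annotated facts rather than as hand-written code.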
mediawiki
halo-extension to Semantic Media Wiki
app areas: project, quality, innovation, hr, content, knowledge management
ability to import/export OWL
characteristics: collab; struct+unstruct knowledge; content reuse;
ad-hoc workflows; simple cost-effective impl
impl
- will expose sparql endpoint soon (now using ASK, need to replace).
- ACL
- will be adding rules engine (but derived rules + ACL tricky)
- replacing MySQL storage with triple storage (Franz, ontobroker, ...)
angele@ontoprise.de
www.ontoprise.com
has booth - with memory stick with running SMW on it.
SW Tools - Zepheira, Eric Miller
8:30am
Reuse, Repurpose, Remix
em@zepheira.com
read/write web - but initially web was mostly read
blog made write much easier for "normal" people
technorati analyzes/structures this info
reuse of data - how easy is it?
action:create/publish/analyze
docs:blog/blogger/technorati
music:
remix: simile; amara; purl (purlz.org); aduna/zepheira;
remix - exhibit mixer
everything becomes a web resource (transforms; data; resource oriented arch)
linked enterprise data
sell it: solve "their" problem (underneath provide resources for future)
business as a web:
- web's problems are enterprise problems
- silos, change, multiple parties, formats, much data
- brittle
- common addressing, linking, data frameworks
Be "IN" the web, not just "ON" it.
If your employees are your most important asset, then empower them.
Persistent Identifiers for the "Real Web", Zepheira, David Wood, Eric Miller
10:30am
Cool URIs don't change.
One thing that makes a URI useful is that you can resolve it.
BAD URI: machine name, port, path, infrastructure (e.g., jsp, php)
metadata encoded in URL.
Real web: you, me, your laptop, this conference
Conceptual graph of relationships of the things in our lives.
We (almost) do this on the web.
Need to turn web of documents into web of structured data (conceptual graph).
*URL Curation* (PURL Server) : http://purl.org
Open source, open standards: purlz.org
Original motivation: change in host; hosting organization
Anything that breaks a link costs money.
MediaWiki has shorthand for PURLs
Impl:
Uses 1060 NetKernel
TYPES:
301 moved permanently
302 found
*303 see other* - tie to SW
307 temporary redirect
404 not found
410 gone
Difference between URI as non-resolvable identifier or a resolvable reference?
301/302 : information resource
303 : physical / conceptual resource (e.g., Moby Dick - see Wikipedia entry)
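A rough sketch of how a PURL lookup maps to these response types. This is not the purlz.org implementation; the table, paths and example.org targets are made up for illustration.

```python
from http import HTTPStatus

# Hypothetical PURL table: path -> (status, target URL or None).
PURLS = {
    # Information resource that moved for good.
    "/docs/spec": (HTTPStatus.MOVED_PERMANENTLY, "http://example.org/spec-v2"),
    # Physical/conceptual resource: point at a document *about* it.
    "/people/ahab": (HTTPStatus.SEE_OTHER, "http://example.org/about/ahab"),
    # Deliberately retired identifier.
    "/old/report": (HTTPStatus.GONE, None),
}


def resolve(path):
    """Return (status code, Location header value or None) for a PURL path."""
    status, target = PURLS.get(path, (HTTPStatus.NOT_FOUND, None))
    return int(status), target
```

The point of the 303 row is that the client is told "the thing itself is not a document, but see this one about it".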
Identify resources by URIs
Use "See Also" PURLs to ensure cross-boundary data integrity
Don't reinvent the wheel:
reuse common public and partner URIs before minting your own
Share a PURL service or use more than one
Coordinate at the info space level (more than code or APIs)
Local control, global access
Experiment with Active PURLs
Active PURL participates in provision of metadata about the resources it represents.
purlz.org
purl.org
en.wikipedia.org/wiki/Persistent_Uniform_Resource_Locator
zepheira.com/talks/semtech2008-dwem.pdf
Freebase - Metaweb, Jamie Taylor
11:45am
His team is responsible for seeding the database.
An open, shared database of the world's knowledge
Creative Commons Attribution License
Open APIs
Community built: collective editing; collaborative semantics
Consensual reality: available data; people, places, products, ...;
Called Topics (e.g., Pontiffs, Art, Airplanes, Cheese, Tropical Storms)
3.4 million topics; 750K people; 450K locations; 50K companies;
40K movies; over 1000 types and 3000 properties
Initially used some wikipedia articles as seeds of freebase db
Not a formal system (e.g., Cyc, SUMO, True Knowledge, Halo)
Not a reasoning engine
Topics in Freebase are Unique (no two Topics represent the same thing)
research.freebase.com
Federated Terminology Authoring Using Semantic MediaWiki
2:00pm
Terminologies in Health Domain:
SNOMED CT / IHTSDO; LOINC; ICD-10, DRGs
ICD 9/10; CPT; HL7
Need them to
- compare
- aggregate
- interchange
- secondary uses
- linkage to decision support services
Terminologies centrally curated and distributed:
- quality, consistency, accuracy
consistency includes deleting an ID and never reusing it for
something else in the future
But slow to deal with user (i.e., the terminology experts) feedback
(e.g., spelling errors, relationships between codes, descriptions)
User best source of info regarding: problems, requirements, uses.
Central assurance necessary but does not scale.
Problem: how to distribute but maintain quality, consistency, accuracy.
Solution: Used semantic media wiki
SMW: collaborative ontology engineering
What to Do with an OWL Reasoner: Introduction to Pellet -
Clark & Parsia, LLC, Evren Sirin
3:15pm
OWL-DL (all) reasoner (and much of OWL2)
Open source in Java: pellet.owldl.com
Bindings to Jena, OWL-API, Protege, TopBraid Composer
Next release dual licensed
Reasoning in OWL
Given set of assertions
- check consistency (no contradictions)
- infer new conclusions
Inference:
Penguin subClassOf Bird
Pablo type Penguin
-> Pablo type Bird
Inconsistency:
Bird subClassOf FlyingThing
Penguin subClassOf Bird
Penguin disjointWith FlyingThing
Obvious here, but when thousands of users, need to automate.
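A toy version of the two examples above, just to show the mechanics. This is plain transitive closure over subClassOf, not Pellet's tableau algorithm, and the dict/tuple encoding is an assumption for illustration.

```python
def superclasses(cls, sub_of):
    """All classes reachable from cls via subClassOf, including cls itself."""
    seen, stack = set(), [cls]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(sub_of.get(c, ()))
    return seen


def unsatisfiable(cls, sub_of, disjoint):
    """True if cls is entailed to fall under two classes declared disjoint."""
    supers = superclasses(cls, sub_of)
    return any(a in supers and b in supers for a, b in disjoint)


# Inference: Pablo is a Penguin, so Pablo is also a Bird.
pablo_types = superclasses("Penguin", {"Penguin": ["Bird"]})

# The problem case: Penguin ends up under FlyingThing, which it is disjoint with.
sub_of = {"Penguin": ["Bird"], "Bird": ["FlyingThing"]}
disjoint = [("Penguin", "FlyingThing")]
penguin_broken = unsatisfiable("Penguin", sub_of, disjoint)
```

Strictly, these three axioms make Penguin unsatisfiable (it can have no members); the ontology as a whole becomes inconsistent once an individual such as Pablo is asserted to be a Penguin.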
Features:
- consistency; classification (subclassing between classes);
realization (is instance)
- conjunctive sparql-dl queries
(combine schema and instance query)
- datatype reasoning
single property - e.g., small monitor has screen <= 15in
combination properties - e.g., widescreen has height/width ratio < .75
- SWRL Rules
DL-safe rules : applies only to instances
- Explaining and debugging
Hard to understand large complex ontologies
Pellet can answer:
Why is a certain subclass relation inferred?
Why is a certain ontology inconsistent?
OwlSight - ontology browser
(to demo pellet features, especially consistency, subclassing)
written in Google GWT
Uses of Pellet:
- data integration
describe data sources using ontologies
define mappings between
use reasoning to answer queries
- healthcare and life sciences
terminology development and axiomatization
decision support; intelligent user interfaces; info integration
NCI, SNOMED, GALEN, FMA, OBO, ...
- Service Oriented Architecture
input/output types
pre/post conditions
languages: SA-WSDL, OWL-S
Reasoner supports : matching requests with services
(semi) automated service composition
- Policy Analysis
languages: XACML, WS-Policy
managing is hard: detect security holes, change impact analysis
tools focus on policy enforcement (runtime) not analysis (design-time)
Use OWL reasoning to analyze
Map policy language to OWL (done for XACML and WS-Policy)
Analysis:
policy subsumption, redundancy, incompatibility, verification, querying
- Config management:
find set of components that satisfy requirements and constraints
- Probabilistic reasoning
Many places for uncertainty : uncertain taxonomic relationships,
facts can be uncertain
social network analysis; breast cancer risk assessment
e.g., A sameAs B with N% probability
*most* birds fly
clarkparsia.com
Q&A: Jess is forward-chaining
Prolog is backward-chaining
Pellet is neither.
It is Semantic tableau: add negation of query and search
Smart Browser - AdaptiveBlue, Alex Iskold
4:30pm
Alex Iskold (Founder and CEO of AdaptiveBlue)
www.adaptiveblue.com
alex.iskold@gmail.com
Much info on web for humans - but not for machines.
Need machine readable info/semantics for greater semantic scalability
BlueOrganizer
firefox addon to auto recognize subset of verticals (without needing metadata)
pages, links, text
unveils a layer of things on top of the web
contextual browsing :
what happens after search? - you know user is looking at book, address
5:50 - 7:45 Exhibits and Reception
7:30-9:30 Semantic Exchange Reception
Linked Data Panel
8:30am
Ralph Swick, W3C
Danny Ayers, Talis (no show)
Giovanni Tummarello, DERI
Nathan Yergler, Creative Commons
Principles of Naming (for HTML and RDF):
- Use URIs as names for things
- Use HTTP URIs so people can look up those names
- When someone looks up a URI, provide useful info
- Include links to other URIs so that they can discover more things
SINDICE
Semantic Sitemap Extension
www.okkam.com
Access Control Policies and their Use in Shared Security Services
- Semantic Arts, Timothy Swanson
9:45am
Message Passing architecture
"Can we use RDF and OWL to represent these messages"?
Security Terminology
- subject, object, operation : user/agent, being requested, CRUD
Motivation
+-> HR/DB
Bob -> dev server --> Security Roles/DB
but if Bob leaves company still has access to Roles/DB
Rule-based security
- Want security roles linked to HR/DB
OBAC = Ontology-Based Access Control
- Users that satisfy a set of conditions can perform operations on a system
subject -> subject
object + operation -> object
subject and object are nodes in RDF graph
Turtle notation:
[]
a :CheckoutRequest ;
:requestedResource :LibraryTermsOfService
...
XACML-DL - Analyzing Web Access Control Policies Using Semantic Technologies - Clark & Parsia, Michael Smith
9:45am
Tim focused on market where ACLs have not been written.
Feasible to start with RDF.
C&P - focus on existing ACL (don't rewrite to RDF, but manage)
XACML, WS-Policy
- languages for expressing policy constraints - how to enforce at runtime
- services explored by reduction to KR formalism - design time
e.g., OWL
auto-discovery of cross-cutting concerns
iterative refinement
Policy Analysis
- denotes a set of "acceptable" things by describing them
XACML
- Access Control policies for distributed resources
- Supports many features
arbitrary attributes in policies
express negative authorization
conflict resolution algorithms
- profiles for common methodologies
Hierarchical Role Based Access Control
XACML language
- Policy is set of Rules
- PolicySet can hold Policies or PolicySets
- Combine algorithms enable modular:
Permit-overrides, deny-overrides, first-applicable, ...
- Access Requests - list of attribute/value pairs:
subject, resource, action (operation), environment (context)
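A simplified sketch of the three combining algorithms named above, applied to a list of per-rule decisions. It ignores the Indeterminate result and obligations that the real XACML spec handles, so treat it as an outline only.

```python
PERMIT, DENY, NOT_APPLICABLE = "Permit", "Deny", "NotApplicable"


def permit_overrides(decisions):
    """Any Permit wins; otherwise any Deny; otherwise NotApplicable."""
    if PERMIT in decisions:
        return PERMIT
    if DENY in decisions:
        return DENY
    return NOT_APPLICABLE


def deny_overrides(decisions):
    """Any Deny wins; otherwise any Permit; otherwise NotApplicable."""
    if DENY in decisions:
        return DENY
    if PERMIT in decisions:
        return PERMIT
    return NOT_APPLICABLE


def first_applicable(decisions):
    """The first rule that applies decides, in policy order."""
    for d in decisions:
        if d != NOT_APPLICABLE:
            return d
    return NOT_APPLICABLE
```

The same rule results can give different answers depending on the algorithm, which is why change impact analysis is hard to do by eye.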
Design-time Support
- detection of security holes
- change impact analysis
Services:
- Shallow testing: test subset of possible requests
- Deep testing: all possible combinations
- Comparison :
- Verification : policy satisfy a particular property
- Incompatibility
- Redundancy
- Querying
Primary difference from OBAC (Tim's, above)
- Decisions : R1-Target SubClassOf: R1-Permit
- Combining algorithms
XACML algos translated to class descriptions
(R1-P or R2-P) and not(R3-D) SubClassOf: P
R3-D SubClassOf: D
Challenges
- Some aspects of policy analysis stretch OWL
Non-monotonicity, built-ins (e.g., math, XPath)
Change analysis and querying
production/business rule like behavior
- Goal to provide as much analysis as possible in OWL-DL
then use other formalism (without being ad-hoc)
Pint - XACML Policy Analysis Tool
Calais - Thomson Reuters, Thomas Tague
11:45am
Calais (KahLay)
5 ways to improve content value (do cool stuff) with Calais
Can handle formats:plain text, html, xml (may expose word and pdf in future)
What is it?
- a semantic metadata generation service
- extracts entities, facts and events from unstructured text
- is a web service with toolkits, frameworks, plugins, apps
- available for commercial and non-commercial use
- www.opencalais.com
Why do they do this free service:
- They keep and will leverage the extracted entities, facts, events
- Trend analysis
- Never expose individual document metadata level - only statistical
- People will NOT get access to metadata collected (private)
Use Calais to auto tag historical archives
- Get improved search, navigation (e.g., powerhousemuseum.com)
Use Calais to create microformat metadata for Yahoo! SearchMonkey (SW)
- Better searchability and user experience
- Get Calais API Key
- Download marmoset (PHP)
- Paste Marmoset into your site template
- Wait for the monkey to visit (will then interact with Calais for tags)
Use Calais to drive alerts or feeds based on events (not just keywords)
- Get highly targeted notification of key events your users care about
- detect and alert on significant events in your content
Use Calais to enable semantic knowledge discovery
- Get content insight
- statistical analysis of document semantics
- doc -> calais -> rdf -> flat file -> spreadsheet then ask
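The "rdf -> flat file" step of that pipeline can be sketched as below. This assumes the RDF has already been parsed into (subject, predicate, object) tuples by some other tool; it does not call the Calais service, and the column names are made up.

```python
import csv
import io


def triples_to_csv(triples):
    """Flatten (subject, predicate, object) tuples into CSV text for a spreadsheet."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["subject", "predicate", "object"])
    writer.writerows(triples)
    return buf.getvalue()
```

Once the rows are in a spreadsheet, counting or pivoting by predicate or entity type is the "then ask" part.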
Use Calais' growing toolkit of apps (many community contributed)
- No coding
- e.g., tagaroo (wordpress plugin),
drupal (CMS),
gnosis (firefox plugin - autotagging of entities on visited pages)
Future:
- open correct
- expose PDF, word format handling (done internally but not exposed)
- internal scoring, but externally it's binary
Exhibits:
textwise
aduna sesame and autofocus (use for looking at nodes in cluster)
Dean's book
Sindice.com - DERI, Dr. Giovanni Tummarello
2:45pm
DERI, Galway, Ireland
Sindice.com (CEO) - (SinDeeChe)
Large Scale Searching, Publishing and Remixing of Semantic Data on the Web
Google Social API (microformats + RDF), OpenCalais (RDF)
Twine (RDF), Freebase, ...
RDF world: linking open data (LOD)
URIs as entry points (http://dbpedia/resource/berline)
Advanced querying (http://wikipedia.3ba.se/)
Microformats: hCalendar, hCard, hListing, XFN, hVotes, KELKOO, ...
Competitive advantages by consuming the web of *data*
- increase quantity/quality
- profiling
- social networking
Sindice
- locate structured data sources
- semantic web pipes: remix data on the fly
- semantic sitemaps: publish large amount of semantic data
Find semantic sources which
- talk of something with specific process
SemanticWebPipes
sindice.com/developers
Use case : Jira + Beetle + extension = rdf
Rising Stars
4:15pm
David Scott Lewis - moderator
Barney Pell - Powerset
Alex Iskold
Nova Spivack
Ian Davis
Tom Gruber
-------------------------
Barney Pell - Powerset
Queries in natural language
-> results
-> NLP processing of search results
-> structured data
-------------------------
Alex Iskold - AdaptiveBlue
"things in pages, links and text"
brings semantics to existing web
auto evolves with web itself
facilitates contextual browsing
focus on pragmatic simple consumer verticals
-------------------------
Nova Spivack - Radar Networks - facebook
doyop.com/twine
-------------------------
Ian Davis - Talis
heritage in large metadata systems: libraries, edu, ...
talis.com/platform
-------------------------
Tom Gruber - Stealth-Company.com
what is killer SW app? search, collective intelligence, context browsing?
Your life (on-line) - intelligence at the interface
language understanding, semantic search, machine learning, integrated services
HTML/HTTP:
user role: choose your path (link)
system role: connect the dots
- URI
Portal:
choose your channels
deliver the content
- frictionless broadcasting
Search engine:
state your query
find relevant content and filter
- ??
Intelligence at the interface:
live your life
tell me what i need to know, help me solve problems, meet needs, be proactive
- personalized context-aware AI
[ Carl Thompson SW tech journalist ]
Get Carl's attention:
Nova:
Need to get past 100 billion triple barrier (need trillions/federated)
Note: porn industry is generally early adopter of tech - but not SW (yet)
Alex:
iphone + SW (go to garden, point at tree, ask "what is it?")
Tom:
Don't push technology.
What are the human needs?
e.g., location graph + social graph
Ian:
Stick in domain you understand
Don't focus on technology
Use SW under the hood
e.g., travel (tripit)
Barney:
Search, advertising and publishing
Plug vertical expertise into SW platform
When will the Gartners of the world start paying attention?
Ian:
??
Tom:
Old analysts don't get it
Barney:
Gartner has been covering since 1997
But no magic quadrants
New content might be better placed in context
(rather than on site of person who created it)
Nova:
He has talked to the analysts
but they don't think of it as A space
They want to know how it will contribute to existing categories
But does their opinion matter that much?
They are playing catchup.
Let's show great practical examples.
Forget about SW and think products
Barney:
Prediction: this conference will be twice the size next year
Semantic Markup of Java Source Code - Build Software LLC, Brian D. Eubanks
5:30pm
Author of Wicked Cool Java
brian@buildsoftware.com
Why?
- Classes represent (sometimes) real-world concepts
What are we processing?
- Methods do things
What are the semantics?
What are inputs, outputs and assumptions?
- Numbers represent real-world quantities
Of what - metrics? money? time?
Which units?
- Apps interop with other systems
Are we speaking same "language"?
What about developer's knowledge?
- Need common ontology for Java concepts
- UML is barely a start
- Location of code
directory of compiled classes
jar / war / ear
- Environment
java version; target platform(s); required libs; app type (CLI, servlet...)
licenses
How do we obtain code?
- analyze user requirements
- existing code
- orchestrate existing services
- generate code
- write new code
Where do we run our code?
- how do we execute: compile, package, execute
...
How to implement
- Annotate
- External RDF
- Reasoning
Annotations - metadata for Java classes
- class, field, method level
Annotation Benefits:
- simpler object-relational mapping
attach RDF metadata to database
match RDF to annotated Java classes
- any app can understand Java code
semantics discovered by reading annotations
generate RDF from Java code
- semantic matching
remove language, platform barriers
describe business logic in RDF
add intelligence
External RDF:
- URIs for compiled classes, jars, source,...
- URIs for Java concepts
External benefits (code/data separation):
- can distinguish between source and lib
- can describe build process
- can describe interactions with other systems ...
- e.g., find open source code that does X
Reasoning
- encode user requirements in RDF
- discover code
- orchestrate code
- generate code
Related APIs
- jenabean.googlecode.com
- www.openrdf.org (ELMO)
- rdfreactor.semweb4j.org
- sommer.dev.java.net
- www.triplescape.com/doapamine
RDF markup benefits
- real-world mappings
- map to RDBMS
- build management
- code orchestration
- code gen / translation
- code search
- business logic (rules)
- business process reengineering
- requirements analysis, data flow analysis
SA-REST - Kno.e.sis, Karthik Gomadam, Dr. Amit
SA-REST
Why?
- create *interoperable* REST services
- smart mashups
- device independent
Foundations:
- MREF, SA-WSDL
MREF:
- 1996/98
- Representing/Correlating info at meta/semantic level
- abstraction on top of RDF and XML
- href for logical relationships
- virtual resource can be embedded in HTML or linked
SA-WSDL (WSDL-S):
- Add semantics to services
- WSDL + modelreference = SAWSDL
- SAWSDL ModelReference: Add semantic annotation to various parts of WSDL doc
[ Slide 15 : WSDL picture ]
SAWSDL
- grounded to semantic meta-models
- but independent of ontology / meta-model spec languages
Lifting and Lowering
- Systematic approach to data mediation
- mediation at schema level
- agree at meta level instead of syntax level
- XSLT driven
Microformats
- humans first, machines second
- simple open formats on existing standards
- easy to add markups to POSH (Plain Old Semantic HTML)
Microformat design patterns
- abbr : human friendly and machine readable
- class-design : indicate semantic meaning
- rel-design : indicate meaning of a link
SA-REST microformat approach
- Add meaning to service descriptions
- what message formats, what methods
- semantic grounding to concepts : domain of an API; annotated inputs/outputs
input/output
- block markup
- markups within this block relate to the input/output
- pattern : class
domain-rel:
- domains of API
- markup on body element
- pattern : abbr
method:
- get/post
- pattern : class
p-lang-binding:
- programming language binding
- pattern : class
sem-rel
- describes a link in an API
- an XSD schema link
- pattern abbr
sem-class
- meta description of content in API
- a la SAWSDL modelref
- pattern : abbr
data-format
- data format descriptor (XML, RSS/ATOM/ gdata...)
- pattern : class
protocol
- soap/rest
- pattern : class
Vehicles
- SA-REST elements can be used with RDFa
- extract RDF via GRDDL
Rules of thumb
- unambiguously identify concept in meta-model
Benefits
- data mediation : like in SAWSDL
- upcast/downcast
- map service to Application Data Model (i.e., easier enable mashups)
- smarter mashups
meta level spec of mashups
searching for service API in faceted manner
code generation
Taxonomies available for APIs
- programmableweb.com : 55 categories
- apihut.com/taxonomy : 60 categories, 4 facets
knoesis.wright.edu/research/srl/standards.sarest
www.w3c.org/2005/Incubator/swsc
ELMO: Mapping Objects to RDF - James Leigh
map Java objects --> RDF
@rdf on interface/class; field; property
RDF --> Java objects
- rdf does not contain object "behavior"
- need to associate/find behavior to match with instance data from RDF
to create instance
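The object -> RDF direction can be sketched with field-level metadata standing in for the @rdf annotation. This is a Python analogy, not ELMO's Java API; the Person class, the "rdf" metadata key and the example.org URI are assumptions (the FOAF terms are real).

```python
from dataclasses import dataclass, field, fields

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass
class Person:
    # Class-level RDF type (no annotation, so not a dataclass field).
    RDF_CLASS = FOAF + "Person"

    uri: str  # subject URI; has no "rdf" metadata, so it is not emitted
    name: str = field(metadata={"rdf": FOAF + "name"})
    mbox: str = field(metadata={"rdf": FOAF + "mbox"})


def to_triples(obj):
    """Turn an annotated object into (subject, predicate, object) tuples."""
    triples = [(obj.uri, RDF_TYPE, obj.RDF_CLASS)]
    for f in fields(obj):
        predicate = f.metadata.get("rdf")
        if predicate is not None:
            triples.append((obj.uri, predicate, getattr(obj, f.name)))
    return triples
```

Going the other way (RDF -> objects) is the harder part noted above: the type triple has to be matched to a class that supplies the behavior.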
Sesame memory store
- MemoryStore - good for < 1million triples
- NativeStore
- RDBMS RDF Store
- Mulgara Store - billions
Showed how to map to/from Java, Groovy, JavaScript, Ruby
Benefits:
- people with little RDF knowledge can persist objects into RDF store.
Panel
Hendler
pollcak/oracle
christine conner/dow jones
ivan/w3c
steve hall/value
gilmore
john
WHAT/HOW
Solve or add value to existing areas (don't lead with SW).
Simplify - but stick with what you know and where your passion lies.
Tools to help spread the technology.
How to connect disparate pieces of info (corporate wikis).
How do you connect customer data that lies inside corporate silos?
Semantics will destroy value before it will add value
e.g., just put out semantic tagged item saying "I am selling X"
reduces value of ebay
open data reduces value of ebay, linkedin, facebook
but then increases value of companies that embrace new open data model
Home run: figure out how to exploit open data.
non-deterministic web is a weakness
(fortune 500 companies need precise info when handling individual customers)
and a strength (trend analysis, suggestions, ...)
Steve: We would like to find and fund a semantic "SharePoint"
Security belongs with the data